
SPECIFICATION 

β1,3-N-ACETYL-D-GALACTOSAMINE TRANSFERASE PROTEIN,
NUCLEIC ACID ENCODING THE SAME AND METHOD OF 
EXAMINING CANCERATION USING THE SAME 

TECHNICAL FIELD 

The present invention relates to a novel β1,3-N-
acetyl-D-galactosaminyltransferase protein and a nucleic
acid encoding the same, as well as a canceration assay
using the same, etc.
BACKGROUND ART 

Recent attention has been focused on the in vivo 
roles of sugar chains and/or complex carbohydrates. For 
example, factors for determining blood types are
glycoproteins, and it is glycolipids that are involved in
the functions of the nervous system. Thus, enzymes having 
the ability to synthesize sugar chains constitute an 
extremely important key to analyzing physiological 
activities provided by various sugar chains. 

For example, N-acetyl-D-galactosamine (hereinafter
also referred to as "GalNAc") is among the components
constituting glycosaminoglycans, as well as being a sugar
residue found in various sugar chain structures such as
glycosphingolipids and mucin-type sugar chains. Thus, an
enzyme transferring GalNAc will serve as an extremely
important tool in analyzing the roles of sugar chains in
various tissues in vivo.

As described above, attention has been focused on the
in vivo roles of sugar chains, but it cannot be said that
sufficient headway has been made in analyzing in vivo sugar
chain synthesis. This is in part because the mechanism of
sugar chain synthesis and the in vivo localization of sugar
synthesis have not been fully analyzed. In analyzing the
mechanism of sugar chain synthesis, it is necessary to
analyze glycosylation enzymes (particularly
glycosyltransferases) and to analyze what kind of sugar
chains are synthesized by means of the enzymes. To this
end, there is a strong demand for searching for novel
glycosyltransferases and analyzing their functions.

There are some reports of glycosyltransferases having
the ability to transfer GalNAc (Non-patent Documents 1 to 
4). For example, among human GalNAc transferases, enzymes 
transferring GalNAc with "β1,4 linkage" are known
(Non-patent Document 1) and enzymes using "galactose" as their
acceptor substrate are known as enzymes transferring GalNAc
with β1,3 linkage (Non-patent Document 2) ("β1,3" or "β3"
as used herein refers to a glycosidic linkage between an 
a-hydroxyl group at the 1-position of a sugar residue in an 
acceptor substrate and a hydroxyl group at the 3-position
of a sugar residue to be transferred and linked thereto). 

On the other hand, in higher organisms like humans, 
no enzyme is known to transfer GalNAc with "β1,3 linkage"
to "N-acetylglucosamine" (hereinafter also referred to as
"GlcNAc").

Although there is a report showing that the sugar 
chain structure in which GalNAc and GlcNAc are linked in a
β1,3 fashion was confirmed in sugar chains on neutral
glycolipids of fly, a kind of arthropod (Non-patent
Document 5), it has been believed that such a sugar chain
structure is not present in mammals, particularly in humans, 
to begin with. 
Patent Document 1 

International Patent Publication No. WO 01/79556 
Non-patent Document 1 

Cancer Res. 1993 Nov 15; 53(22): 5395-400: Yamashiro S, Ruan
S, Furukawa K, Tai T, Lloyd KO, Shiku H, Furukawa K. Genetic
and enzymatic basis for the differential expression of GM2 
and GD2 gangliosides in human cancer cell lines. 
Non-patent Document 2 

Biochim Biophys Acta. 1995 Jan 3; 1254(1): 56-65: Taga S,
Tetaud C, Mangeney M, Tursz T, Wiels J. Sequential changes
in glycolipid expression during human B cell
differentiation: enzymatic bases.
Non-patent Document 3 

Proc Natl Acad Sci USA. 1996 Oct 1; 93(20): 10697-702:
Haslam DB, Baenziger JU. Expression cloning of Forssman
glycolipid synthetase: a novel member of the histo-blood
group ABO gene family.
Non-patent Document 4 

J Biol Chem. 1997 Sep 19; 272(38): 23503-14: Wandall HH, 
Hassan H, Mirgorodskaya E, Kristensen AK, Roepstorff P, 
Bennett EP, Nielsen PA, Hollingsworth MA, Burchell J, 
Taylor-Papadimitriou J, Clausen H. Substrate specificities 
of three members of the human UDP-N-acetyl-alpha-D-
galactosamine:polypeptide N-acetylgalactosaminyltransferase
family, GalNAc-T1, -T2, and -T3.
Non-patent Document 5

J. Biochem. (Tokyo) 1990 June; 107(6): 899-903: Sugita M,
Inagaki F, Naito H, Hori T. Studies on glycosphingolipids
in larvae of the green-bottle fly, Lucilia caesar: two
neutral glycosphingolipids having large straight
oligosaccharide chains with eight and nine sugars.

DISCLOSURE OF THE INVENTION

A problem to be solved by the present invention is to 
provide a polypeptide which is a mammal-derived
(particularly human-derived) glycosyltransferase and which
has a novel transferase activity to transfer GalNAc with
β1,3 linkage to GlcNAc, as well as a nucleic acid encoding
such a polypeptide, etc.

Another problem to be solved by the present invention 
is to provide a transformant expressing the nucleic acid in 
host cells, a method for producing the encoded protein by
allowing the transformant to produce the protein and then
collecting the protein, and an antibody recognizing the
protein.

On the other hand, since sugar chain synthesis may be 
affected by canceration, the identification and expression 
analysis of such a glycosylation enzyme can be expected to
provide an index useful for cancer diagnosis, etc. The
present invention also provides detailed procedures and
criteria useful for canceration assay or the like by
analyzing and comparing, at the tissue or cell line level,
the transcription level of such a protein which varies in
correlation with canceration or malignancy.
BRIEF DESCRIPTION OF DRAWINGS 
Figure 1 is a diagram showing changes in the activity
of the G34 enzyme protein according to this example,
plotted against the reaction time.

Figure 2A shows the results of NMR measurement, used
for analysis of the sugar chain structure synthesized by
the G34 enzyme protein according to this example.

Figure 2B shows a partial magnified view of the NMR
results in Figure 2A.

Figure 3 is a table summarizing NOE in NMR shown in
Figure 2. Various conditions for the data in Table 1 are
as follows: 1.08 mM, 298 K, D2O, CH2(high) = 4.557 ppm for
non-marked data; chemical shifts for data marked with *
are CH2(low) = 4.778 ppm, phenyl(ortho) = 7.265 ppm,
phenyl(meta) = 7.354 ppm and phenyl(para) = 7.320 ppm,
calculated from the 1D spectrum.

Figure 4 is a table summarizing relevant data
(tentative NOE) for each pyranose with respect to NMR shown
in Figure 2 (s: strong, m: medium, w: weak, vw: very weak,
A: GlcNAc, B: GalNAc).

Figure 5 shows a comparison of amino acid sequences
between G34 enzyme protein according to this example and
known β3Gal transferases.

Figure 6 shows a comparison of motifs involved in the
β3-linking activity between G34 enzyme protein according to
this example and various known β3-linking
glycosyltransferases. "b3" represents a β1-3 linkage and
"Gn" represents GlcNAc.

Figure 7 is a diagram showing the pH dependence of
the activity of the G34 enzyme protein according to this
example.

Figure 8 is a diagram showing ion requirement for the
activity of the G34 enzyme protein according to this
example.

Figure 9 presents graphs showing the expression
levels of the G34 enzyme protein according to this example
in human cell lines.

Figure 10 shows amino acid sequence alignment between
mouse G34 according to this example (upper) and human G34
(lower).

Figure 11 shows the result of in situ hybridization 
performed on a mouse testis sample using the mG34 nucleic 
acid according to this example. 
DETAILED DESCRIPTION OF THE INVENTION 

To solve the problems stated above, the inventors of
the present invention have attempted to isolate and purify
a nucleic acid of interest, which may have high sequence
identity, on the basis of the nucleotide sequence of an
enzyme gene functionally similar to the intended enzyme.

More specifically, first, the sequence of a known
glycosyltransferase, β3 galactosyltransferase 6 (β3GalT6),
was used as a query for a BLAST search to thereby find a
sequence with homology (GenBank No. AX285201). It should
be noted that this nucleotide sequence was known as the
sequence of SEQ ID NO: 1006 disclosed in International
Publication No. WO 01/79556 (Patent Document 1 listed
above), but its activity remained unknown.
The inventors of the present invention then
independently cloned the above gene by PCR, determined
its nucleotide sequence (SEQ ID NO: 1) and putative amino
acid sequence (SEQ ID NO: 2), and succeeded in
identifying a certain biological activity of a polypeptide
encoded by the nucleic acid, thus completing the present
invention. Moreover, when using the sequence as a query to
search mouse genes, the inventors have found the nucleotide
sequence of SEQ ID NO: 3 and its putative amino acid
sequence (SEQ ID NO: 4).
The gene having the nucleotide sequence of SEQ ID NO:
1 and the protein having the amino acid sequence of SEQ ID
NO: 2 were designated human G34, while the gene having the
nucleotide sequence of SEQ ID NO: 3 and the protein having
the amino acid sequence of SEQ ID NO: 4 were designated
mouse G34.

According to the studies of the inventors, the above
G34 protein uses an N-acetyl-D-galactosamine residue as a
donor substrate and an N-acetyl-D-glucosamine residue as an
acceptor substrate. As detailed later in Example 2, the
G34 protein was found to retain three motifs in its amino
acid sequence, which are well conserved in the enzyme
family transferring various sugars (e.g., galactose,
N-acetyl-D-glucosamine) in the linking mode of β1,3. In
light of these points, the G34 protein was unexpectedly
believed to have transferase activity to synthesize a novel
sugar chain structure, "GalNAc-β1,3-GlcNAc," for which no
report has been made for mammals, particularly humans. The
linking mode was actually confirmed by NMR.

Namely, the present invention relates to a β1,3-N-
acetyl-D-galactosaminyltransferase protein which transfers
N-acetyl-D-galactosamine to N-acetyl-D-glucosamine with
β1,3 linkage.

An enzyme protein according to a preferred embodiment 
of the present invention may have at least one or any 
combination of the following properties (a) to (c). 
(a) Acceptor substrate specificity 

When using an oligosaccharide as an acceptor
substrate, the enzyme protein shows transferase activity
toward Bz-β-GlcNAc, GlcNAc-β1-4-GlcNAc-β-Bz,
Gal-β1-3(GlcNAc-β1-6)GalNAc-α-pNp, GlcNAc-β1-3GalNAc-α-pNp and
GlcNAc-β1-6GalNAc-α-pNp ("GlcNAc" represents an N-acetyl-D-
glucosamine residue, "GalNAc" represents an N-acetyl-D-
galactosamine residue, "Bz" represents a benzyl group,
"pNp" represents a p-nitrophenyl group, and "-" represents
a glycosidic linkage. Numbers in these formulae each
represent the carbon number in the sugar ring where a
glycosidic linkage is present, and "α" and "β" represent
anomers of the glycosidic linkage at the 1-position of the
sugar ring. An anomer whose positional relationship with
CH2OH or CH3 at the 5-position is trans and cis is
represented by "α" and "β", respectively).

Preferably, the enzyme protein is substantially free
from transferase activity toward Bz-α-GlcNAc and
Gal-β1-3GlcNAc-β-pNp.

(b) Reaction pH

The activity is lower in a pH range of 6.2 to 6.6
than in other pH ranges.

(c) Divalent ion requirement

Although the above activity is enhanced at least in
the presence of Mn2+, Co2+ or Mg2+, the Mn2+-induced
enhancement of the activity is almost completely eliminated
in the presence of Cu2+.

Moreover, in a preferred embodiment of the above
glycosyltransferase protein, the glycosyltransferase
protein of the present invention comprises the following
polypeptide (A) or (B):

(A) a polypeptide which has the amino acid sequence shown 
in SEQ ID NO: 2 or 4; or 

(B) a polypeptide which has an amino acid sequence with
substitution, deletion or insertion of one or more amino
acids in the amino acid sequence shown in SEQ ID NO: 2 or 4
and which transfers N-acetyl-D-galactosamine to N-acetyl-D-
glucosamine with β1,3 linkage.

Moreover, in a more preferred embodiment of the above
glycosyltransferase protein, the above polypeptide (A) is a
glycosyltransferase protein consisting of a polypeptide
having an amino acid sequence covering amino acids 189 to
500 shown in SEQ ID NO: 2. Likewise, in an even more
preferred embodiment of the above glycosyltransferase
protein, the above polypeptide (A) is a glycosyltransferase
protein consisting of a polypeptide having an amino acid
sequence covering amino acids 36 to 500 shown in SEQ ID NO:
2.

In addition, other embodiments of the
glycosyltransferase protein of the present invention
encompass proteins consisting of polypeptides having amino
acid sequences sharing more than 30% identity,
preferably at least 40% identity, and more preferably at
least 50% identity with an amino acid sequence covering
amino acids 189 to 500 shown in SEQ ID NO: 2 or amino acids
35 to 504 shown in SEQ ID NO: 4.

In another aspect, the present invention provides a
nucleic acid consisting of a nucleotide sequence encoding
any one of the above polypeptides or a nucleotide sequence
complementary thereto.

In a preferred embodiment, the nucleic acid encoding
the protein of the present invention is a nucleic acid
consisting of the nucleotide sequence shown in SEQ ID NO: 1
or 3 or a nucleotide sequence complementary to at least one
of them. More preferably, in the case of human origin,
such a nucleic acid consists of a nucleotide sequence
covering nucleotides 565 to 1503 shown in SEQ ID NO: 1 or a
nucleotide sequence complementary thereto, and most
preferably consists of a nucleotide sequence covering
nucleotides 106 to 1503 shown in SEQ ID NO: 1 or a
nucleotide sequence complementary thereto. In the case of
mouse origin, such a nucleic acid consists of a nucleotide
sequence covering nucleotides 103 to 1512 shown in SEQ ID
NO: 3 or a nucleotide sequence complementary thereto.
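
The relationship between the amino acid ranges and the nucleotide ranges given above can be checked with simple codon arithmetic. The following is a minimal sketch only; it assumes that the open reading frame begins at nucleotide 1 of each sequence, which the text does not state, and the function name is hypothetical.

```python
def residue_range_to_nucleotides(first_residue, last_residue, orf_start=1):
    """Map a 1-based amino acid range to the 1-based nucleotide range encoding it.

    orf_start is the position of the first base of the start codon.
    The default of 1 is an assumption, not a value taken from the text.
    """
    start = orf_start + 3 * (first_residue - 1)
    end = orf_start + 3 * last_residue - 1
    return start, end


if __name__ == "__main__":
    for label, first, last in [
        ("human, amino acids 189 to 500", 189, 500),
        ("human, amino acids 36 to 500", 36, 500),
        ("mouse, amino acids 35 to 504", 35, 504),
    ]:
        print(label, residue_range_to_nucleotides(first, last))
```

Under this assumption the computed start positions (565, 106 and 103) agree with the nucleotide ranges recited above, and the mouse end position (1512) also agrees. The computed human end position is 1500 rather than 1503; the three-nucleotide difference would be consistent with a stop codon being included in the recited range, although the text does not say so.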

Embodiments of the above nucleic acids according to
the present invention encompass DNA.

The present invention further provides a vector
carrying any one of the above nucleic acids and a
transformant containing the vector.

In yet another aspect, the present invention provides
a method for producing a β1,3-N-acetyl-D-
galactosaminyltransferase protein, which comprises growing
the above transformant to express the above
glycosyltransferase protein and collecting the
glycosyltransferase protein from the grown transformant.

In yet another aspect, the present invention provides
an antibody recognizing any one of the above β1,3-N-acetyl-
D-galactosaminyltransferase proteins.

On the other hand, in response to the discovery of
the above G34, the inventors of the present invention have
clarified that the expression level of G34 mRNA is
increased significantly in cancerous tissues and cell lines.

Thus, the present invention also provides a nucleic
acid for measurement, which is useful as an index of
canceration or malignancy and which hybridizes under
stringent conditions to the nucleotide sequence shown in
SEQ ID NO: 1 or 3 or a nucleotide sequence complementary to
at least one of them.

The nucleic acid for measurement of the present
invention may typically consist of a nucleotide sequence
covering at least a dozen contiguous nucleotides in the
nucleotide sequence shown in SEQ ID NO: 1 or 3 or a
nucleotide sequence complementary thereto.

In a preferred embodiment, the nucleic acid for 
measurement of the present invention encompasses a probe 
consisting of the nucleotide sequence shown in SEQ ID NO: 
16 or a nucleotide sequence complementary thereto, as well 
as a primer set consisting of the following nucleotide 
sequences (1) or (2): 

(1) a pair of the nucleotide sequences shown in SEQ ID
NOs: 14 and 15; or

(2) a pair of the nucleotide sequences shown in SEQ ID 
NOs: 17 and 18. 

Also, the nucleic acid for measurement of the present 
invention may be used as a tumor marker. 

The present invention further provides a method for 
assaying canceration in a biological sample, which 
comprises: 

(a) using any one of the above nucleic acids to measure 
the transcription level of the nucleic acid in the 
biological sample; and 

(b) determining whether the measured value is
significantly higher than that of a normal biological
sample.
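
Steps (a) and (b) above can be sketched as a comparison of measured transcription levels. This is an illustration only: the text here does not fix a numerical criterion for "significantly higher", so the fold-change cutoff of 2.0 is a hypothetical value, the function name is hypothetical, and the readings are invented placeholders rather than data from this specification. A fold-change cutoff is also not a statistical significance test; it merely stands in for whichever criterion is adopted.

```python
from statistics import mean


def compare_to_normal(sample_levels, normal_levels, fold_cutoff=2.0):
    """Return (fold_change, elevated) for replicate transcript-level readings.

    fold_cutoff is a hypothetical threshold chosen for illustration only.
    """
    sample_mean = mean(sample_levels)
    normal_mean = mean(normal_levels)
    if normal_mean <= 0:
        raise ValueError("normal transcript level must be positive")
    fold = sample_mean / normal_mean
    return fold, fold >= fold_cutoff


if __name__ == "__main__":
    # Invented placeholder readings (arbitrary units), not measured data.
    fold, elevated = compare_to_normal([8.1, 7.6, 9.0], [2.0, 2.4, 1.9])
    print(f"fold change = {fold:.2f}, elevated = {elevated}")
```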

In a preferred embodiment, the canceration assay of
the present invention includes cases where the measurement
of the transcription level is made by hybridization or PCR
targeted at the above biological sample and using any one
of the above nucleic acids.

In a further aspect of the canceration assay of the
present invention, the present invention provides a method
for assaying the effectiveness of treatment in cancer
therapy, which comprises using any one of the above nucleic
acids to measure the transcription level of the nucleic
acid in a biological sample treated by cancer therapy, and
determining whether the measured value is significantly
lower than that obtained before treatment or than that of
an untreated sample.

In particular, the above biological sample may be 
derived from the large intestine (colon) or lung. 
MODE FOR CARRYING OUT THE INVENTION 

The mode for carrying out the present invention will
be described in detail below.

(1) Nucleic acid encoding the G34 enzyme protein of the 
present invention 

Based upon the above discovery, the inventors of the
present invention expressed the G34 enzyme protein encoded
by the nucleic acid, isolated and purified the protein, and
further identified its enzymatic activity. When focusing
on the fact that an amino acid sequence having the desired
enzymatic activity was identified, the nucleotide sequence
of SEQ ID NO: 1 or 3 is one embodiment of a nucleic acid
encoding the isolated polypeptide having the enzymatic
activity. This means that the nucleic acid of the present
invention encompasses all nucleic acids, limited in number,
that have degenerate nucleotide sequences capable of
encoding the same amino acid sequence for the G34 enzyme
protein.
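
The statement that the degenerate nucleotide sequences are limited in number follows from the genetic code: each amino acid is encoded by between one and six codons, so the number of coding sequences for a given polypeptide is the product of the codon counts of its residues. The sketch below illustrates this with the standard genetic code and short toy peptides; it does not use the G34 sequence, and the function name is hypothetical.

```python
from math import prod

# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Number of codons that encode each residue ("*" denotes a stop codon).
CODONS_PER_RESIDUE = {}
for codon, residue in CODON_TABLE.items():
    CODONS_PER_RESIDUE[residue] = CODONS_PER_RESIDUE.get(residue, 0) + 1


def count_encodings(peptide):
    """Count the distinct nucleotide sequences encoding a one-letter peptide string."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


if __name__ == "__main__":
    for toy_peptide in ("MW", "ML", "LSR"):
        print(toy_peptide, count_encodings(toy_peptide))
```

For a polypeptide of several hundred residues the count is very large but still finite, which is the sense in which the set of degenerate sequences is bounded.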

The present invention also provides a nucleic acid
encoding the full-length or a fragment of a polypeptide
consisting of a novel amino acid sequence as mentioned
above. A typical nucleic acid encoding such a novel
polypeptide may have the nucleotide sequence shown in SEQ
ID NO: 1 or 3 or a nucleotide sequence complementary to at
least one of them.

The nucleic acid of the present invention also
encompasses both single-stranded and double-stranded DNA
and their complementary RNA. Examples of DNA include
naturally-occurring DNA, recombinant DNA, chemically-bound
DNA, PCR-amplified DNA, and combinations thereof. However,
DNA is preferred in terms of stability during vector and/or
transformant preparation.

The nucleic acid of the present invention may be 
prepared in the following manner, by way of example. 

First, the known sequence under GenBank No. AX285201
or a part thereof may be used to perform nucleic acid
amplification on a cDNA library in a routine manner using
basic procedures for genetic engineering (e.g.,
hybridization, nucleic acid amplification), thereby cloning
the nucleic acid of the present invention. Since the
nucleic acid may be obtained, e.g., as a DNA fragment of
approximately 1.5 kbp as a PCR product, the fragment may be
separated using techniques for screening DNA fragments
based on their molecular weight (e.g., agarose gel
electrophoresis) and isolated in a routine manner, e.g.,
using techniques for excising a specific band.

Moreover, according to the putative amino acid
sequence (SEQ ID NO: 2 or 4) of the isolated nucleic acid,
the encoded polypeptide may be estimated to have a hydrophobic
transmembrane region at its N-terminal end. By preparing a
region of a nucleotide sequence encoding a polypeptide free
from this transmembrane region, it is also possible to
obtain the nucleic acid of the present invention that
encodes a soluble form of the polypeptide.

Based on the nucleotide sequence of the nucleic acid
disclosed herein, it is easy for those skilled in the art
to create appropriate primers from nucleotide sequences
located at both ends of a nucleic acid of interest or a
region thereof to be prepared and to use the primers thus
created for nucleic acid amplification to amplify and
prepare the region of interest.
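
Creating primers from the sequences at both ends of a region, as described above, can be sketched as follows. This is a simplified illustration: the primer length of 20 and the function names are assumptions, the template in the usage lines is a toy sequence rather than SEQ ID NO: 1 or 3, and practical primer design would also take melting temperature, secondary structure and any added restriction or recombination sites into account.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def end_primers(template, start, end, length=20):
    """Return (forward, reverse) primers flanking the 1-based inclusive region start..end.

    The forward primer copies the first `length` bases of the region; the
    reverse primer is the reverse complement of its last `length` bases.
    """
    region = template[start - 1:end]
    if len(region) < length:
        raise ValueError("region is shorter than the requested primer length")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse


if __name__ == "__main__":
    toy_template = "ATGGCCAAGTTTCCCGGGTAA"  # toy sequence for illustration
    print(end_primers(toy_template, 1, len(toy_template), length=6))
```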

The above nucleic acid amplification includes, for
example, reactions requiring thermal cycling such as
polymerase chain reaction (PCR) [Saiki R. K., et al.,
Science, 230, 1350-1354 (1985)], ligase chain reaction
(LCR) [Wu D. Y., et al., Genomics, 4, 560-569 (1989);
Barringer K. J., et al., Gene, 89, 117-122 (1990); Barany
F., Proc. Natl. Acad. Sci. USA, 88, 189-193 (1991)] and
transcription-based amplification [Kwoh D. Y., et al., Proc.
Natl. Acad. Sci. USA, 86, 1173-1177 (1989)], as well as
isothermal reactions such as strand displacement
amplification (SDA) [Walker G. T., et al., Proc. Natl. Acad.
Sci. USA, 89, 392-396 (1992); Walker G. T., et al., Nuc.
Acids Res., 20, 1691-1696 (1992)], self-sustained sequence
replication (3SR) [Guatelli J. C., Proc. Natl. Acad. Sci.
USA, 87, 1874-1878 (1990)] and Qβ replicase system [Lizardi
et al., BioTechnology 6, p. 1197-1202 (1988)]. It is also
possible to use other reactions, e.g., nucleic acid
sequence-based amplification (NASBA) through competitive
amplification between a target nucleic acid and a mutated
sequence, found in European Patent No. 0525882. Preferred
is PCR.

The use of the nucleic acid of the present invention 
also enables the expression of the intended enzyme protein 
or the provision of probes and antisense primers for the 
purpose of medical research or gene therapy, as described 
later. 

Those skilled in the art will be able to obtain a
nucleic acid as useful as the sequence of SEQ ID NO: 1 or 3
by preparing a nucleic acid consisting of a nucleotide
sequence sharing a certain homology with the nucleotide
sequence of SEQ ID NO: 1 or 3. For example, the homologous
nucleic acid of the present invention encompasses nucleic
acids encoding proteins which share homology with the amino
acid sequence shown in SEQ ID NO: 2 or 4 and which have the
ability to transfer N-acetyl-D-galactosamine to N-acetyl-D-
glucosamine with β1,3 linkage.

To identify the range of nucleic acids encoding such
homologous proteins according to the present invention, an
identity search was performed for the nucleic acid sequence
shown in SEQ ID NO: 1 or 3 of the present invention,
indicating that the nucleic acid sequence shares 40%
identity with the nucleic acid sequence of a known
β1,4GalNAc transferase showing the highest homology
(Non-patent Document 1 listed above) and also shares 40%
identity with the nucleic acid sequence of a known β1,3Gal
transferase showing the highest homology (Non-patent
Document 2 listed above). In light of these points, a
preferred nucleic acid sequence encoding the homologous
protein of the present invention typically shares more than
40% identity, more preferably at least 50% identity, and
particularly preferably at least 60% identity with any one
of the entire nucleotide sequence of SEQ ID NO: 1 or 3,
preferably a partial nucleotide sequence consisting of
nucleotides 106 to 1503 in SEQ ID NO: 1, preferably a
partial nucleotide sequence consisting of nucleotides 103
to 1512 in SEQ ID NO: 3, or nucleotide sequences
complementary to these sequences.

Likewise, the nucleotide sequences shown in SEQ ID
NOs: 1 and 3 share 86% identity with each other. In light
of this point, a preferred nucleic acid sequence encoding
the homologous protein of the present invention can be
defined as sharing at least 86%, preferably 90% identity
with any one of the entire nucleotide sequence of SEQ ID
NO: 1, preferably nucleotides 106 to 1503, or a nucleotide
sequence complementary thereto.

The above percentage of identity may be determined by
visual inspection and mathematical calculation.
Alternatively, the percentage of identity between two
nucleic acid sequences may be determined by comparing
sequence information using the GAP computer program,
version 6.0, described by Devereux et al., Nucl. Acids Res.
12: 387, 1984 and available from the University of
Wisconsin Genetics Computer Group (UWGCG). The preferred
default parameters for the GAP program include: (1) a unary
comparison matrix (containing a value of 1 for identities
and 0 for non-identities) for nucleotides, and the weighted
comparison matrix of Gribskov and Burgess, Nucl. Acids Res.
14: 6745, 1986, as described by Schwartz and Dayhoff, eds.,
Atlas of Protein Sequence and Structure, pp. 353-358,
National Biomedical Research Foundation, 1979; (2) a
penalty of 3.0 for each gap and an additional 0.10 penalty
for each symbol in each gap; and (3) no penalty for end
gaps. It is also possible to use other sequence comparison
programs used by those skilled in the art.
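
The scoring rules listed above can be illustrated on a pair of sequences that have already been aligned. The sketch below is not the GAP program and does not perform alignment; it only applies the unary matrix (1 for an identity, 0 otherwise) and the stated gap penalties (3.0 per gap plus 0.10 per symbol in the gap, with end gaps free) to toy aligned strings in which "-" marks a gap. Excluding gapped columns from the identity denominator is an assumption, since definitions of percent identity differ on this point, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over ungapped columns, scoring 1 for a match and 0 otherwise."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def gap_penalty(aligned, per_gap=3.0, per_symbol=0.10):
    """Total penalty for internal gaps in one aligned sequence; end gaps are free."""
    core = aligned.strip("-")  # leading and trailing gaps carry no penalty
    total = 0.0
    run = 0
    for symbol in core:
        if symbol == "-":
            run += 1
        elif run:
            total += per_gap + per_symbol * run
            run = 0
    return total


if __name__ == "__main__":
    print(percent_identity("ACGT-A", "ACCTGA"))
    print(gap_penalty("--AC--GT-"))
```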

Other nucleic acids homologous as the structural gene
of the present invention typically include nucleic acids
which hybridize under stringent conditions to a nucleotide
consisting of a nucleotide sequence within SEQ ID NO: 1 or
3, preferably a nucleotide sequence consisting of
nucleotides 106 to 1503 of SEQ ID NO: 1, preferably a
nucleotide sequence consisting of nucleotides 103 to 1512
of SEQ ID NO: 3, or a nucleotide sequence complementary
thereto and which encode polypeptides having the ability to
transfer N-acetyl-D-galactosamine to N-acetyl-D-glucosamine
with β1,3 linkage.

As used herein, "under stringent conditions" means
that a nucleic acid hybridizes under conditions of moderate
or high stringency. More specifically, conditions of
moderate stringency may readily be determined by those
having ordinary skill in the art, e.g., depending on the
length of DNA. Primary conditions can be found in Sambrook
et al., Molecular Cloning: A Laboratory Manual, 3rd edition,
Vol. 1, 7.42-7.45, Cold Spring Harbor Laboratory Press, 2001
and include the use of a prewashing solution for
nitrocellulose filters of 5 x SSC, 0.5% SDS, 1.0 mM EDTA (pH
8.0), hybridization conditions of about 50% formamide, 2 x
SSC to 6 x SSC at about 40-50°C (or other similar
hybridization solutions, such as Stark's solution, in about
50% formamide at about 42°C) and washing conditions of
about 60°C, 0.5 x SSC, 0.1% SDS. Conditions of high
stringency can also be readily determined by those skilled
in the art, e.g., depending on the length of DNA. In
general, such conditions include hybridization and/or
washing at a higher temperature and/or at a lower salt
concentration than that required under conditions of
moderate stringency and, for example, are defined as
hybridization conditions as above and with washing at about
68°C, 0.2 x SSC, 0.1% SDS. Those skilled in the art will
recognize that the temperature and washing solution salt
concentration can be adjusted as necessary according to
factors such as the length of nucleotide sequences.
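
The difference between the moderate-stringency and high-stringency washes described above can be made concrete by converting the SSC dilutions to sodium ion concentrations. The sketch below relies on the standard composition of 1 x SSC (0.15 M NaCl and 0.015 M trisodium citrate), which is general laboratory knowledge rather than something stated in this specification; the dictionary layout and function names are hypothetical.

```python
# Standard 1 x SSC: 150 mM NaCl plus 15 mM trisodium citrate (3 Na+ per citrate).
NACL_MM_AT_1X = 150.0
CITRATE_MM_AT_1X = 15.0

# Washing conditions as recited in the text.
WASHES = {
    "moderate": {"temp_c": 60, "ssc": 0.5, "sds_percent": 0.1},
    "high": {"temp_c": 68, "ssc": 0.2, "sds_percent": 0.1},
}


def sodium_mM(ssc_fold):
    """Approximate Na+ concentration (mM) contributed by SSC at the given dilution."""
    return ssc_fold * (NACL_MM_AT_1X + 3 * CITRATE_MM_AT_1X)


def is_more_stringent(a, b):
    """True if wash a is both hotter and lower in salt than wash b."""
    return a["temp_c"] > b["temp_c"] and a["ssc"] < b["ssc"]


if __name__ == "__main__":
    for name, wash in WASHES.items():
        print(name, wash["temp_c"], "C,", round(sodium_mM(wash["ssc"]), 1), "mM Na+")
    print(is_more_stringent(WASHES["high"], WASHES["moderate"]))
```

The SDS in the wash buffer also contributes a small amount of sodium, which this sketch ignores.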

As described above, those skilled in the art will
readily determine and achieve conditions of suitably
moderate or high stringency on the basis of common
knowledge about hybridization conditions which are known in
the art, as well as on the empirical rule which will be
obtained through commonly used experimental means.

(2) Vector and transformant of the present invention

The present invention provides a recombinant vector
carrying the above nucleic acid. Procedures for
integrating a DNA fragment of the nucleic acid into a
vector (e.g., a plasmid) include those described in
Sambrook, J. et al., Molecular Cloning, A Laboratory Manual
(3rd edition), Cold Spring Harbor Laboratory, 1.1 (2001).
For convenience, a commercially available ligation kit
(e.g., a product of TaKaRa Shuzo Co., Ltd., Japan) may be
used.

The recombinant vector (e.g., recombinant plasmid)
thus obtained may be introduced into host cells (e.g., E.
coli DH5α, TB1, LE392, XL-LE392 or XL1-Blue).
Procedures for introducing the plasmid into host cells
include those described in Sambrook, J. et al., Molecular
Cloning, A Laboratory Manual (3rd edition), Cold Spring
Harbor Laboratory, 16.1 (2001), exemplified by the calcium
chloride method or the calcium chloride/rubidium chloride
method, electroporation, electroinjection, chemical
treatment (e.g., PEG treatment), and the gene gun method.

A vector which can be used may be prepared readily by
linking a desired gene to a recombination vector available
in the art (e.g., plasmid DNA) in a routine manner.
Specific examples of a vector to be used include, but are
not limited to, E. coli-derived plasmids such as pDONR201,
pBluescript, pUC18, pUC19 and pBR322.

Those skilled in the art will be able to select
appropriate restriction ends to fit into the intended
expression vector. The expression vector may be selected
appropriately by those skilled in the art such that the
vector is suitable for host cells where the enzyme of the
present invention is to be expressed. Moreover, the
expression vector is preferably constructed to allow
regions involved in gene expression (e.g., promoter region,
enhancer region and operator region) to be properly located
to ensure expression of the above nucleic acid in target
host cells, so that the nucleic acid is properly expressed.

The type of expression vector is not limited in any
way as long as the vector allows expression of a desired
gene in various prokaryotic and/or eukaryotic host cells
and has the function of producing a desired protein.
Preferred examples include pQE-30, pQE-60, pMAL-C2, pMAL-p2
and pSE420 for E. coli expression, pYES2 (Saccharomyces)
and pPIC3.5K, pPIC9K and pAO815 (all Pichia) for yeast
expression, as well as pFastBac, pBacPAK8/9, pBK283,
pVL1392 and pBlueBac4.5 for insect expression.

To construct the expression vector, a Gateway system
(Invitrogen Corporation) may be used which does not require
restriction treatment and ligation operation. The Gateway
system is a site-specific recombination system which allows
cloning while maintaining the orientation of PCR products
and also allows subcloning of a DNA fragment into a
properly modified expression vector. More specifically,
this system prepares an expression clone corresponding to
the intended expression system by creating an entry clone
from a PCR product and a donor vector by the action of a
site-specific recombinase BP clonase and then transferring
the PCR product to a destination vector which allows
recombination with this clone by the action of another
recombinase LR clonase. One feature of this system is that
a time- and labor-consuming subcloning step which requires
treatment with restriction enzymes and/or ligases can be
eliminated when an entry clone is created to begin with.

The above expression vector carrying the nucleic acid
of the present invention may be integrated into host cells
to give a transformant for producing the polypeptide of the
present invention. In general, host cells used for
obtaining the transformant may be either eukaryotic cells
(e.g., mammalian cells, yeast, insect cells) or prokaryotic
cells (e.g., E. coli, Bacillus subtilis). Also, cultured
cells of human origin (e.g., HeLa, 293T, SH-SY5Y) or mouse
origin (e.g., Neuro2a, NIH3T3) may be used for this purpose.
All of these host cells are known and commercially
available (e.g., from Dainippon Pharmaceutical Co., Ltd.,
Japan), or available from public research institutions
(e.g., RIKEN Cell Bank). Alternatively, it is also
possible to use embryos, organs, tissues or non-human
individuals.

Since the nucleic acid of the present invention was
found from human genome libraries, it is believed that when
eukaryotic cells are used as host cells, the G34 enzyme
protein of the present invention may have properties close
to native proteins (e.g., embodiments where glycosylation
occurs). In light of this point, it is preferable to
select eukaryotic cells, particularly mammalian cells, as
host cells. Specific examples of mammalian cells include
animal cells of mouse, Xenopus laevis, rat, hamster, monkey
or human origin or cultured cell lines established from
these cells. E. coli, yeast or insect cells available for
use as host cells are specifically exemplified by E. coli
(e.g., DH5α, M15, JM109, BL21), yeast (e.g., INVSc1
(Saccharomyces), GS115, KM71 (both Pichia)) or insect cells
(e.g., Sf21, BmN4, silkworm larva).

In general, an expression vector can be prepared by
linking at least a promoter, an initiation codon, a gene
encoding a desired protein, a termination codon and a
terminator region to an appropriate replicable unit to give
a continuous loop. In this case, if desired, it is also
possible to use an appropriate DNA fragment (e.g., linkers,
other restriction enzyme sites) through routine techniques
such as digestion with a restriction enzyme and/or ligation
using T4 DNA ligase. When bacterial (particularly E. coli)
cells are used as host cells, an expression vector is
generally composed of at least a promoter/operator region,
an initiation codon, a gene encoding a desired protein, a
termination codon, a terminator and a replicable unit.
When yeast cells, plant cells, animal cells or insect cells
are used as host cells, it is generally preferred that an
expression vector comprises at least a promoter, an
initiation codon, a gene encoding a desired protein, a
termination codon and a terminator. In this case, the
vector may also comprise DNA encoding a signal peptide, an
enhancer sequence, 5'- and 3'-terminal untranslated regions
of the desired gene, a selective marker region or a
replicable unit, as appropriate.

A replicable unit refers to DNA having the ability to
replicate its entire DNA sequence in host cells and
includes a native plasmid, an artificially modified plasmid
(i.e., a plasmid prepared from a native plasmid) and a
synthetic plasmid. Examples of a preferred plasmid include
plasmid pQE30, pET or pCAL or an artificially modified
product thereof (i.e., a DNA fragment obtained from pQE30,
pET or pCAL by treatment with an appropriate restriction
enzyme) for E. coli cells, plasmid pYES2 or pPIC9K for
yeast cells, as well as plasmid pBacPAK8/9 for insect cells.

A methionine codon (ATG) may be given as an example
of an initiation codon preferred for the vector of the
present invention. Examples of a termination codon include
commonly used termination codons (e.g., TAG, TGA, TAA). As
for enhancer and terminator sequences, it is also possible
to use those commonly used by those skilled in the art,
such as SV40-derived enhancer and terminator sequences.
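
The initiation and termination codons named above can be checked on a
candidate coding sequence before it is placed into a vector. The
following Python fragment is a minimal sketch only; the function name
is_simple_orf and the rule that no stop codon may occur in frame before
the final codon are assumptions made for illustration and are not part
of the present specification.

```python
# Termination codons listed in the text (DNA alphabet).
STOP_CODONS = {"TAG", "TGA", "TAA"}


def is_simple_orf(dna: str) -> bool:
    """Return True if dna starts with ATG, ends with a stop codon,
    has a length divisible by 3 and has no earlier in-frame stop codon.

    Hypothetical helper for illustration; real coding sequences may
    need further checks (e.g., ambiguous bases, introns).
    """
    seq = dna.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    # Split the sequence into consecutive in-frame codons.
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Reject premature in-frame termination codons.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])
```

For example, is_simple_orf("ATGGCCTAA") is true, whereas a sequence with
an internal in-frame TAA or without an initial ATG is rejected.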

As a selective marker, a commonly used one can be
used in a routine manner. Examples include antibiotic
resistance genes such as those conferring resistance to
tetracycline, ampicillin, kanamycin, neomycin, hygromycin
or spectinomycin.

The introduction (also referred to as transformation
or transfection) of the expression vector according to the
present invention into host cells may be accomplished by
using conventionally known techniques. Transformation may
be accomplished, for example, by the method of Cohen et al.
[Proc. Natl. Acad. Sci. USA, 69, 2110 (1972)], the
protoplast method [Mol. Gen. Genet., 168, 111 (1979)] or
the competent method [J. Mol. Biol., 56, 209 (1971)] for
bacterial cells (e.g., E. coli, Bacillus subtilis) and by
the method of Hinnen et al. [Proc. Natl. Acad. Sci. USA, 75,
1927 (1978)] or the lithium method [J. Bacteriol., 153,
163 (1983)] for Saccharomyces cerevisiae. Transformation
may also be accomplished, for example, by the leaf disk
method [Science, 227, 129 (1985)] or electroporation
[Nature, 319, 791 (1986)] for plant cells, by the method of
Graham et al. [Virology, 52, 456 (1973)] for animal cells,
and by the method of Summers et al. [Mol. Cell Biol., 3,
2156-2165 (1983)] for insect cells.

(3) G34 enzyme protein of the present invention

As illustrated in the Example section described later,
a polypeptide having a novel enzymatic activity can be
isolated and purified, for example, by integrating a
nucleic acid having the nucleotide sequence of SEQ ID NO: 1
or 3 into an expression vector and then expressing the
nucleic acid.

First, in light of the above point, a typical
embodiment of the protein of the present invention is an
isolated G34 enzyme protein consisting of the putative
amino acid sequence shown in SEQ ID NO: 2 or 4. More
specifically, this enzyme protein has the activities shown
below.
Catalytic reaction:

The enzyme protein allows transfer of "N-acetyl-D-
galactosamine (GalNAc)" from its donor substrate to an
acceptor substrate containing "N-acetyl-D-glucosamine
(GlcNAc)." Examination of motif sequences in the amino
acid sequence indicates that the linking mode between
N-acetylgalactosamine and N-acetylglucosamine is a β1,3
glycosidic linkage (see Example 2).
Donor substrate specificity:

The above N-acetyl-D-galactosamine donor substrate
encompasses sugar nucleotides having N-acetylgalactosamine,
such as uridine diphosphate-N-acetylgalactosamine (UDP-
GalNAc), adenosine diphosphate-N-acetylgalactosamine
(ADP-GalNAc), guanosine diphosphate-N-acetylgalactosamine
(GDP-GalNAc) and cytidine diphosphate-N-acetylgalactosamine
(CDP-GalNAc). A typical donor substrate is UDP-GalNAc.

Namely, the G34 enzyme protein of the present
invention catalyzes a reaction of the following scheme:
UDP-GalNAc + GlcNAc-R → UDP + GalNAc-β1,3-GlcNAc-R
(wherein R represents, e.g., a glycoprotein, glycolipid,
oligosaccharide or polysaccharide having the GlcNAc
residue).

Acceptor substrate specificity:

An acceptor substrate of the above GalNAc is
N-acetyl-D-glucosamine, typically an N-acetyl-D-glucosamine
residue of glycoproteins, glycolipids, oligosaccharides or
polysaccharides, etc.

When using an oligosaccharide as an acceptor
substrate, the human G34 protein obtained in Example 1
described later (typically having a region covering amino
acid 36 to the C-terminal end of SEQ ID NO: 2) shows
transferase activity toward Bz-β-GlcNAc, GlcNAc-β1-4-
GlcNAc-β-Bz, pNp-core2 (core2 = Gal-β1-3-(GlcNAc-β1-6)
GalNAc-α-pNp; the same applying hereinafter), pNp-core3
(core3 = GlcNAc-β1-3-GalNAc-α-pNp; the same applying
hereinafter) and pNp-core6 (core6 = GlcNAc-β1-6-GalNAc-α-
pNp; the same applying hereinafter). Preferably, the human
G34 protein is free from transferase activity toward Bz-α-
GlcNAc and Gal-β1-3-GlcNAc-β-pNp. Moreover, when the
activity is compared between these substrates, the
transferase activity is very high in transferring to pNp-
core2 and Bz-β-GlcNAc, and highest in transferring
to pNp-core2. The transferase activity is relatively low
in transferring to GlcNAc-β1-4-GlcNAc-β-Bz, pNp-core3 and
pNp-core6.

Likewise, the mouse G34 protein obtained in Example 4
described later (typically having an active region covering
amino acid 35 to the C-terminal end of SEQ ID NO: 4) shows
transferase activity toward Bz-β-GlcNAc, pNp-β-Glc, GlcNAc-
β1-4-GlcNAc-β-Bz, pNp-core2, pNp-core3 and pNp-core6. When
the activity is compared between these substrates, the
transferase activity is highest in transferring to Bz-β-
GlcNAc, followed by pNp-core2, pNp-core6, pNp-core3, pNp-β-
Glc and GlcNAc-β1-4-GlcNAc-β-Bz in the order named.

As used herein, "GlcNAc" represents an N-acetyl-D-
glucosamine residue, "GalNAc" represents an N-acetyl-D-
galactosamine residue, "Glc" represents a glucose
residue, "Bz" represents a benzyl group, "pNp" represents a
p-nitrophenyl group, "oNp" represents an o-nitrophenyl group,
and "-" represents a glycosidic linkage. Numbers in these
formulae each represent the carbon number in the sugar ring
where the above glycosidic linkage is present. Likewise,
"α" and "β" represent anomers of the above glycosidic
linkage at the 1-position of the sugar ring. Anomers
whose positional relationship with CH2OH or CH3 at the
5-position is trans or cis are represented by "α" and "β",
respectively.

Optimum buffer and optimum pH (Table 3 and Figure 4); 

Examination of the human G34 protein indicates that
the protein has the above catalytic effect in each of the
following optimum buffers: MES (2-morpholinoethanesulfonic
acid) buffer, sodium cacodylate buffer or HEPES (N-[2-
hydroxyethyl]piperazine-N'-[2-ethanesulfonic acid]) buffer.

The pH dependence of the activity in each buffer is
as follows: in MES buffer, the activity is highest around a
pH of at least 5.50 to 5.78 and second highest around pH
6.75; in sodium cacodylate buffer, the activity increases
with decrease in pH from around 6.2 to around 5.0 and is
highest around pH 5.0, while the activity also increases in
a pH-dependent manner between around pH 6.2 and 7.0 and
nearly plateaus around pH 7.4; and in HEPES buffer, the
activity is highest around a pH of 7.4 to 7.5. Among them,
HEPES buffer at a pH of about 7.4 to about 7.5 results in
the strongest activity. In all the buffers, the activity
is lower in a pH range of 6.2 to 6.6 than in other pH
ranges.

Divalent ion requirement (Table 4 and Figure 5);

The activity of the human G34 protein is enhanced in
the presence of a divalent metal ion, particularly Mn2+,
Co2+ or Mg2+. The influence of each metal ion concentration
on the activity is as follows: in the case of Mn2+ and Co2+,
the activity increases in a concentration-dependent manner
up to around 5.0 nM and then nearly plateaus at higher
concentrations, while in the case of Mg2+, the activity
increases in a concentration-dependent manner up to around
2.5 nM and then nearly plateaus at higher concentrations.
However, the Mn2+-induced enhancement of the activity is
completely eliminated in the presence of Cu2+.

As described above, the G34 enzyme protein of the
present invention can transfer a GalNAc residue to a GlcNAc
residue with β1-3 glycosidic linkage under given enzymatic
reaction conditions as mentioned above and is useful for
such sugar chain synthesis or modification reactions
targeted at glycoproteins, glycolipids, oligosaccharides or
polysaccharides, etc.

Secondly, having disclosed herein the amino acid
sequences shown in SEQ ID NOs: 2 and 4 which are given as
typical examples of the primary structure of the above
enzyme protein, the present invention provides all proteins
which can be produced on the basis of these amino acid
sequences through genetic engineering procedures well known
in the art (hereinafter also referred to as "mutated
proteins" or "modified proteins"). Namely, according to
common knowledge in the art, the enzyme protein of the
present invention is not limited only to a protein
consisting of the amino acid sequence of SEQ ID NO: 2 or 4
estimated from the nucleotide sequence of each cloned
nucleic acid, and is also intended to include, for example,
a protein consisting of a non-full-length polypeptide
having, e.g., a partial N-terminal deletion of the amino
acid sequence, or a protein homologous to such an amino
acid sequence, each of which has properties inherent to the
protein, as illustrated below.

First, the human G34 enzyme protein of the present
invention may preferably have an amino acid sequence
covering amino acid 189 to the C-terminal end of SEQ ID NO:
2, more preferably an amino acid sequence covering amino
acid 36 to the C-terminal end as obtained in the Example
section described later. Likewise, the mouse G34 enzyme
protein of the present invention may preferably have an
amino acid sequence covering amino acid 35 to the
C-terminal end of SEQ ID NO: 4.
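
The phrase "amino acid N to the C-terminal end" uses 1-based residue
numbering. The following Python fragment is a minimal sketch of how
such a region could be extracted from a sequence string; the function
name c_terminal_region and the toy sequence are hypothetical and are
not the sequences of SEQ ID NO: 2 or 4.

```python
def c_terminal_region(sequence: str, first_residue: int) -> str:
    """Return the residues from first_residue (1-based, inclusive)
    to the C-terminal end of sequence."""
    if not 1 <= first_residue <= len(sequence):
        raise ValueError("first_residue is outside the sequence")
    # Convert the 1-based residue number to a 0-based slice start.
    return sequence[first_residue - 1:]


# Placeholder sequence for illustration only.
toy_sequence = "MKTAYIAKQR"
print(c_terminal_region(toy_sequence, 4))  # prints AYIAKQR
```

With a full-length sequence in hand, a call with first_residue = 36
would yield the region described above for the human protein.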

Moreover, it is well known that proteins having
physiological activities, such as enzymes, usually maintain
those activities even when their amino acid sequences have
substitution, deletion, insertion or addition of one or
more amino acids. It is also known that among
naturally-occurring proteins, there are mutated
proteins which have gene mutations resulting from
differences in the species of source organisms and/or
differences in ecotype or which have one or more amino acid
mutations resulting from the presence of closely resembling
isozymes, etc. In light of this point, the protein of the
present invention also encompasses mutated proteins which
have an amino acid sequence with substitution, deletion,
insertion or addition of one or more amino acids in the
amino acid sequence shown in SEQ ID NO: 2 or 4 and which
have the ability to transfer a GalNAc residue to a GlcNAc
residue with β1-3 glycosidic linkage under given enzymatic
reaction conditions as mentioned above. Moreover,
particularly preferred are modified proteins having amino
acid sequences with substitution, deletion, insertion or
addition of one or several amino acids in the amino acid
sequence shown in SEQ ID NO: 2 or 4.

The expression "one or more amino acids" found above
means preferably 1 to 200 amino acids, more preferably 1 to
100 amino acids, even more preferably 1 to 50 amino acids,
and most preferably 1 to 20 amino acids. In general, in a
case where amino acid substitution occurs as a result of
site-specific mutagenesis, the number of amino acids which
can be substituted while maintaining the activities
inherent to the original protein is preferably 1 to 10.

The modified protein of the present invention also
includes those obtained by substitution between
functionally equivalent amino acids. Namely, it is
generally well known to those skilled in the art that
recombinant proteins having a desired mutation(s) can be
prepared by procedures involving introduction of
substitution between functionally equivalent amino acids
(e.g., replacement of one hydrophobic amino acid with
another hydrophobic amino acid, replacement of one
hydrophilic amino acid with another hydrophilic amino acid,
replacement of one acidic amino acid with another acidic
amino acid, or replacement of one basic amino acid with
another basic amino acid). The modified proteins thus
obtained often have the same properties as the original
protein. In light of this point, modified proteins having
such amino acid substitutions also fall within the scope of
the present invention.
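
Substitution between functionally equivalent amino acids, as described
above, can be illustrated by grouping residues into the four classes
named in the text. The following Python fragment is a sketch under
stated assumptions: the assignment of each one-letter code to a class
follows one common textbook grouping and is not defined in the present
specification, and the names AMINO_ACID_CLASSES, residue_class and
is_conservative are hypothetical.

```python
# Assumed grouping of the 20 standard amino acids (one-letter codes).
# Other groupings exist; this one is for illustration only.
AMINO_ACID_CLASSES = {
    "hydrophobic": set("AVLIMFWP"),
    "hydrophilic": set("STNQCYG"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}


def residue_class(residue: str) -> str:
    """Return the class name of a one-letter amino acid code."""
    code = residue.upper()
    for name, members in AMINO_ACID_CLASSES.items():
        if code in members:
            return name
    raise ValueError(f"unknown amino acid: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues fall in the same class."""
    return residue_class(original) == residue_class(replacement)
```

Under this grouping, replacing leucine with isoleucine (L to I) or
aspartate with glutamate (D to E) counts as a substitution between
functionally equivalent amino acids, whereas replacing lysine with
aspartate (K to D) does not.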

Moreover, the modified protein of the present
invention may be a glycoprotein having sugar chains
attached to the polypeptide as long as it has such an amino
acid sequence as defined above and has an enzymatic
activity inherent to the intended enzyme.

To identify the range of the homologous protein of
the present invention, an identity search using GENETYX
software (Genetyx Corporation, Japan) was performed for the
amino acid sequence shown in SEQ ID NO: 2 or 4 of the
present invention, indicating that the amino acid sequence
shares 14% identity with a known β1,4GalNAc transferase
showing the highest homology (Non-patent Document 1 listed
above) and also shares 30% identity with a known β1,3Gal
transferase showing the highest homology (Non-patent
Document 2 listed above). In light of these points, an
amino acid sequence for the homologous protein of
the present invention preferably shares more than 30%
identity, more preferably at least 40% identity, and
particularly preferably at least 50% identity with the
amino acid sequence shown in SEQ ID NO: 2 or 4.

Likewise, the amino acid sequences shown in SEQ ID
NOs: 2 and 4 share 88% identity with each other. In light
of this point, a preferred amino acid sequence for the
homologous protein of the present invention can be defined
as sharing at least 88%, more preferably at least 90%,
identity with the amino acid sequence shown in SEQ ID NO: 2.

The above GENETYX is genetic information processing
software for nucleic acid/protein analysis and enables
standard analyses of homology and multialignment, as well
as signal peptide prediction, promoter site prediction and
secondary structure prediction. The homology analysis
program used herein employs the Lipman-Pearson method
(Lipman, D.J. & Pearson, W.R., Science, 227, 1435-1441
(1985)) frequently used as a rapid and sensitive method.
In the present invention, the percentage of identity may be
determined by comparing sequence information using, e.g.,
the BLAST program described by Altschul et al. (Nucl. Acids
Res., 25, 3389-3402 (1997)) or the FASTA program described
by Pearson et al. (Proc. Natl. Acad. Sci. USA, 85, 2444-2448
(1988)). These programs are available on the Internet at
the web site of the National Center for Biotechnology
Information (NCBI) or the DNA Data Bank of Japan (DDBJ).
The details of various conditions (parameters) for each
identity search using each program are shown on these web
sites, and default values are commonly used for these
searches although part of the settings may be changed as
appropriate. It is also possible to use other sequence
comparison programs used by those skilled in the art.
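
For reference, the percentage of identity between two sequences that
have already been aligned can be computed as sketched below in Python.
This is a minimal illustration only and is not the algorithm of
GENETYX, BLAST or FASTA; the function name percent_identity and the
choice of denominator (all alignment columns except those where both
sequences have a gap) are assumptions, and different programs may
normalize differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the aligned columns.

    Both inputs must be rows of the same pairwise alignment, i.e.
    equal-length strings in which gaps are written as the gap symbol.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # a column with gaps in both rows carries no information
        columns += 1
        if a == b:
            matches += 1  # identical residues (a gap never equals a residue)
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns
```

For example, the toy alignment "MKTA-LV" against "MKSAGLV" has five
identical columns out of seven, i.e. about 71% identity. A threshold
such as "at least 88% identity" can then be tested directly against the
returned value.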

Thirdly, the isolated protein of the present
invention may be administered as an immunogen to an animal
to produce an antibody against the protein, as described
later. Such an antibody may be used for immunoassays to
measure and quantify the enzyme. Thus, the present
invention is also useful in preparing such an immunogen.
In light of this point, the protein of the present
invention also includes a polypeptide fragment, mutant or
fusion protein thereof, which contains an antigenic
determinant or epitope for eliciting antibody formation.
(4) Isolation and purification of the G34 enzyme protein of 
the present invention 

The enzyme protein of the present invention may be
isolated and purified in the following manner.

Recent studies have established genetic engineering
procedures which involve culturing and growing a
transformant and isolating and purifying a substance of
interest from the resulting culture or grown transformant.
The enzyme protein of the present invention may also be
expressed (produced), e.g., by culturing in a nutrient
medium a transformant containing an expression vector
carrying the nucleic acid of the present invention.

A nutrient medium used for transformant culturing
preferably contains a carbon source, an inorganic nitrogen
source or an organic nitrogen source required for host cell
(transformant) growth. Examples of a carbon source include
glucose, dextran, soluble starch, sucrose and methanol.
Examples of an inorganic or organic nitrogen source include
ammonium salts, nitrate salts, amino acids, corn steep
liquor, peptone, casein, meat extracts, soybean meal and
potato extracts. If desired, the medium may contain other
nutrients such as inorganic salts (e.g., sodium chloride,
calcium chloride, sodium dihydrogen phosphate, magnesium
chloride), vitamins, and antibiotics (e.g., tetracycline,
neomycin, ampicillin, kanamycin). Culturing may be
accomplished in a manner known in the art. Culture
conditions such as temperature, medium pH and culture
period may be appropriately selected such that the protein
according to the present invention is produced in a large
quantity.

The enzyme protein of the present invention may be
obtained from the above culture or grown transformant as
follows. Namely, in a case where a protein of interest is
accumulated in host cells, the host cells may be collected
by manipulations such as centrifugation or filtration,
suspended in an appropriate buffer (e.g., Tris buffer,
phosphate buffer, HEPES buffer or MES buffer at a
concentration around 10 to 100 mM, the pH of which will
vary from buffer to buffer, but desirably falls within the
range of 5.0 to 9.0), and then crushed in a manner suitable
for the host cells used, followed by centrifugation to
obtain the contents of the host cells. On the other hand,
in a case where a protein of interest is secreted from host
cells, the host cells and the medium are separated from
each other by manipulations such as centrifugation or
filtration to obtain a culture filtrate. The crushed host
cell solution or culture filtrate may be provided directly
or may be treated by ammonium sulfate precipitation and
dialysis before being provided for isolation and
purification of the protein.

Isolation and purification of a protein of interest
may be accomplished in the following manner. Namely, in a
case where the protein is labeled with a tag such as
6 x histidine, GST or maltose-binding protein, the isolation
and purification may be accomplished by affinity
chromatography suitable for each of the commonly used tags.
On the other hand, in a case where the protein according to
the present invention is produced without being labeled
with such a tag, the isolation and purification may be
accomplished, e.g., by ion exchange chromatography, which
may further be combined with gel filtration, hydrophobic
chromatography, isoelectric chromatography, etc.

Moreover, an expression vector may be constructed to
facilitate isolation and purification. In particular, the
isolation and purification is facilitated if an expression
vector is constructed to express a fusion protein of a
polypeptide having an enzymatic activity with an
identification peptide and the enzyme protein is prepared
in a genetic engineering manner. An example of the above
identification peptide is a peptide which, when the enzyme
according to the present invention is prepared by gene
recombination techniques and expressed as a fusion protein
in which the identification peptide is attached to a
polypeptide having an enzymatic activity, has the function
of facilitating secretion, separation, purification or
detection of the enzyme from the grown transformant.

Examples of such an identification peptide include
peptides such as a signal peptide (a peptide composed of 15
to 30 amino acid residues, which is present at the N-
terminal end of many proteins and is functional in cells
for protein selection in the intracellular membrane
permeation mechanism; e.g., OmpA, OmpT, Dsb), protein
kinase A, Protein A (a protein with a molecular weight of
about 42,000, which is a component constituting the
Staphylococcus aureus cell wall), glutathione S transferase,
His tag (a sequence consisting of 6 to 10 histidine
residues in series), myc tag (a 13 amino acid sequence
derived from cMyc protein), FLAG peptide (an analysis
marker composed of 8 amino acid residues), T7 tag (composed
of the first 11 amino acid residues of the gene 10 protein),
S tag (composed of pancreas RNase A-derived 15 amino acid
residues), HSV tag, pelB (a 22 amino acid sequence from the
E. coli external membrane protein pelB), HA tag (composed
of hemagglutinin-derived 10 amino acid residues), Trx tag
(thioredoxin sequence), CBP tag (calmodulin-binding
peptide), CBD tag (cellulose-binding domain), CBR tag
(collagen-binding domain), β-lac/blu (β-lactamase), β-gal
(β-galactosidase), luc (luciferase), HP-Thio (His-patch
thioredoxin), HSP (heat shock peptide), Lnγ (laminin
γ-peptide), Fn (fibronectin partial peptide), GFP (green
fluorescent peptide), YFP (yellow fluorescent peptide), CFP
(cyan fluorescent peptide), BFP (blue fluorescent peptide),
DsRed, DsRed2 (red fluorescent peptides), MBP (maltose-
binding peptide), LacZ (lactose operator), IgG
(immunoglobulin G), avidin and Protein G, any of which can
be used.

Among them, particularly preferred are the signal
peptide, protein kinase A, Protein A, glutathione S
transferase, His tag, myc tag, FLAG peptide, T7 tag, S tag,
HSV tag, pelB and HA tag because they facilitate expression
and purification of the enzyme according to the present
invention through genetic engineering procedures. In
particular, it is preferable to obtain the enzyme as a
fusion protein with FLAG peptide (Asp-Tyr-Lys-Asp-Asp-Asp-
Asp-Lys) because it is very easy to handle. The above FLAG
peptide is extremely antigenic and provides an epitope
capable of reversible binding of a specific monoclonal
antibody, thus enabling rapid assay and easy purification
of the expressed recombinant protein. A mouse hybridoma
called 4E11 produces a monoclonal antibody which binds to
FLAG peptide in the presence of a certain divalent metal
cation, as described in United States Patent No. 5,011,912
(incorporated herein by reference). A 4E11 hybridoma cell
line has been deposited under Accession No. HB 9259 with
the American Type Culture Collection. The monoclonal
antibody binding to FLAG peptide is available from Eastman
Kodak Co., Scientific Imaging Systems Division, New Haven,
Connecticut.

pFLAG-CMV-1 (SIGMA) can be presented as an example of
a basic vector which can be expressed in mammalian cells
and enables the enzyme protein of the present invention to
be obtained as a fusion protein with the above FLAG peptide.
Likewise, examples of a vector which can be expressed in
insect cells include, but are not limited to, pFBIF (i.e.,
a vector prepared by integrating the region encoding FLAG
peptide into pFastBac (Invitrogen Corporation); see the
Example section described later). Those skilled in the art
will be able to select an appropriate basic vector
depending on, e.g., the host cell, restriction enzyme and
identification peptide to be used for expression of the
enzyme.

(5) Antibody recognizing the G34 enzyme protein of the
present invention

The present invention provides an antibody which is
immunoreactive to the G34 enzyme protein. Such an antibody
is capable of specifically binding to the enzyme protein
via the antigen-binding site of the antibody (as opposed to
non-specific binding). More specifically, a protein having
the amino acid sequence of SEQ ID NO: 2 or 4 or a fragment,
mutant or fusion protein thereof may be used as an
immunogen for producing an antibody immunoreactive to each
of them.

More specifically, such a protein, fragment, mutant
or fusion protein contains an antigenic determinant or
epitope for eliciting antibody formation. The antigenic
determinant or epitope may be either linear or
conformational (discontinuous). The antigenic determinant
or epitope can be identified by any technique known in the
art. Thus, the present invention also relates to an
antigenic epitope of the G34 enzyme protein. Such an
epitope is useful in preparing an antibody, particularly a
monoclonal antibody, as described in more detail below.

The epitope of the present invention can be used in
assays and as a research reagent for purifying a specific
binding antibody from materials such as polyclonal sera or
supernatants from cultured hybridomas. Such an epitope or
a variant thereof may be prepared using techniques known in
the art (e.g., solid phase synthesis, chemical or enzymatic
cleavage of a protein) or using recombinant DNA technology.

The enzyme protein of the present invention may be
used to derive any embodiment of an antibody. If the
entire or partial polypeptide of the protein, or an epitope
thereof, has been isolated, both polyclonal and monoclonal
antibodies can be prepared using conventional techniques.
See, e.g., Kennett et al. (eds.), Monoclonal Antibodies,
Hybridomas: A New Dimension in Biological Analyses, Plenum
Press, New York, 1980.

The present invention also provides a hybridoma cell
line producing a monoclonal antibody specific to the G34
enzyme protein. Such a hybridoma can be produced and
identified by conventional techniques. One method for
producing such a hybridoma cell line involves immunizing an
animal with the enzyme protein of the present invention,
collecting spleen cells from the immunized animal, fusing
the spleen cells with a myeloma cell line to give hybridoma
cells, and identifying a hybridoma cell line which produces
a monoclonal antibody binding to the enzyme. The resulting
monoclonal antibody may be collected by conventional
techniques.

The monoclonal antibody of the present invention
encompasses chimeric antibodies, for example, humanized
mouse monoclonal antibodies. Such a humanized antibody is
advantageous in reducing immunogenicity when administered
to a human subject.

The present invention also provides an antigen-
binding fragment of the above antibody. Examples of an
antigen-binding fragment which can be produced by
conventional techniques include, but are not limited to,
Fab and F(ab')2 fragments. The present invention also
provides an antibody fragment and derivative which can be
produced by genetic engineering techniques.

The antibody of the present invention can be used in
assays to detect the presence of the G34 enzyme protein of
the present invention or a polypeptide fragment thereof,
either in vitro or in vivo. The antibody of the present
invention may also be used in purifying the G34 enzyme
protein or a polypeptide fragment thereof by immunoaffinity
chromatography.

Moreover, the antibody of the present invention may
also be provided as a blocking antibody capable of blocking
the binding of the above glycosyltransferase protein to its
binding partner (e.g., acceptor substrate), thus inhibiting
the enzyme's biological activity resulting from such
binding. Such a blocking antibody may be identified using
any suitable assay procedure, for example, by testing the
antibody for the ability to inhibit the binding of the
protein to certain cells expressing an acceptor substrate.

Alternatively, the blocking antibody may also be
identified in assays for the ability to inhibit a
biological effect resulting from the enzyme protein bound
to its binding partner in target cells. Such an antibody
may be used in an in vitro procedure or administered in
vivo to inhibit a biological activity mediated by the
antigen against which the antibody was raised. Thus, the
present invention also provides an antibody for treating
disorders which are caused or exacerbated by either direct
or indirect interaction between the G34 enzyme protein and
its binding partner. Such therapy will involve in vivo
administration of the blocking antibody to a mammal in an
amount effective for inhibiting a binding partner-mediated
biological activity. For use in such therapy, monoclonal
antibodies are preferred and, in one embodiment, an
antigen-binding antibody fragment is used.

(6) Nucleic acid of the present invention for canceration assay

Following the discovery of the above G34 enzyme
protein, the inventors of the present invention have
confirmed that mRNA encoding this protein is widely found
in cancerous tissues and cell lines and that the expression
level of the mRNA is significantly increased particularly
in cancerous tissues. Thus, the G34 nucleic acid is useful
as a tumor marker, e.g., for cancer diagnosis targeted at
biological samples containing transcription products. In
this aspect, the present invention provides a nucleic acid
for measurement, which is capable of hybridizing under
stringent conditions to a nucleic acid defined by the
nucleotide sequence shown in SEQ ID NO: 1 or 3.

In one embodiment, the nucleic acid for measurement
of the present invention is a primer or probe targeting the
G34 nucleic acid in a biological sample and having a
nucleotide sequence selected from the nucleotide sequence
of SEQ ID NO: 1 or 3. In particular, since the nucleotide
sequence of SEQ ID NO: 1 is derived from mRNA encoding a
structural gene and contains the entire open reading frame
(ORF) of the G34 gene, full-length or nearly full-length
sequences of SEQ ID NO: 1 or 3 are usually found in
transcription products from a biological sample. In light
of this point, the primer or probe according to the present
invention has a desired partial sequence selected from each
nucleotide sequence of SEQ ID NO: 1 or 3 (either homologous
or complementary to the selected sequence depending on the
intended use) and hence can be provided as a nucleic acid
capable of specifically hybridizing to the target sequence.

Typical examples of such a primer or probe include a
native DNA fragment derived from a nucleic acid having at
least a part of the nucleotide sequence shown in SEQ ID NO:
1 or 3, a DNA fragment synthesized to have at least a part
of the nucleotide sequence shown in SEQ ID NO: 1 or 3, or
complementary strands of these fragments.

Such a primer or probe as mentioned above may be used
to detect and/or quantify the target nucleic acid in a
biological sample, as described later. Since sequences on
the genome can also be targeted, the nucleic acid of the
present invention may also be used as an antisense primer
for medical research or gene therapy.

(A) Probe of the present invention

In a preferred embodiment, the nucleic acid for
measurement of the present invention is a probe targeting a
nucleic acid having the nucleotide sequence of SEQ ID NO: 1
or 3 or a complementary strand of at least one of them.
The probe contains an oligonucleotide composed of at least
a dozen nucleotides, preferably at least 15 nucleotides,
preferably at least 17 nucleotides, and more preferably at
least 20 nucleotides selected from the nucleotide sequences
of SEQ ID NOs: 1 and 3, or a complementary strand of the
oligonucleotide, or full-length cDNA of its ORF region or a
complementary strand of the cDNA.

In a case where the nucleic acid for measurement of
the present invention is provided as an oligonucleotide
probe, it is understood that a length of a dozen
nucleotides (e.g., 15 nucleotides, preferably 17
nucleotides) may be sufficient for the nucleic acid to
specifically hybridize under stringent conditions to its
target nucleic acid. Namely, those skilled in the art will
be able to select an appropriate partial sequence composed
of at least 15 to 20 nucleotides from the nucleotide
sequence of SEQ ID NO: 1 or 3 in accordance with various
known strategies for oligonucleotide probe design. In
this case, the amino acid sequence information shown in SEQ
ID NO: 2 or 4 is helpful in selecting a unique sequence
that may be suitable as a probe.
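
The selection of a partial sequence of 15 to 20 nucleotides can be sketched as a sliding-window search. The following is a minimal illustration only, not the inventors' design procedure: the function names are hypothetical, the example sequence is a placeholder rather than SEQ ID NO: 1 or 3, and uniqueness is checked only within the given target, whereas practical probe design would also screen against other transcripts.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def candidate_probes(target: str, length: int = 20) -> list[tuple[int, str]]:
    """List (1-based start, complementary probe) for windows occurring once in target."""
    target = target.upper()
    windows = [target[i:i + length] for i in range(len(target) - length + 1)]
    # Keep only windows whose sequence appears exactly once in the target.
    unique = [(i + 1, w) for i, w in enumerate(windows) if windows.count(w) == 1]
    # A probe detecting the sense strand is the complementary strand of the window.
    return [(start, reverse_complement(w)) for start, w in unique]


# Placeholder sequence for illustration only; it is NOT SEQ ID NO: 1 or 3.
toy_target = "ATGGCTAGCTAGGATCCGATTACGCTAGGCTA"
probes = candidate_probes(toy_target, length=15)
```

Each returned tuple gives the position of the selected window in the target and the complementary oligonucleotide that would hybridize to it.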

Likewise, in the case of a cDNA probe, for example, a
probe with a high molecular weight is generally difficult
to handle when used as a reagent or diagnostic agent for
medical research. In light of this point, the probe of the
present invention intended for medical research includes a
nucleic acid composed of 50 to 500 nucleotides, more
preferably 60 to 300 nucleotides selected from each
nucleotide sequence of SEQ ID NO: 1 or 3.

The term "stringent conditions" used above means
conditions of moderate or high stringency as explained
earlier. Those skilled in the art will be able to readily
determine and achieve conditions of moderate or high
stringency suitable for the selected probe, on the basis of
common knowledge and empirical rules concerning known
procedures for probe design and hybridization conditions.

Although depending on, e.g., the nucleotide length to
be selected and the hybridization conditions to be applied,
a relatively short oligonucleotide probe can serve as a
probe even when it has a mismatch of one or several
nucleotides, particularly one or two nucleotides, in
comparison with the nucleotide sequence of SEQ ID NO: 1 or
3. Likewise, a relatively long cDNA probe can also serve
as a probe even when it has a mismatch of 50% or less,
preferably 20% or less, in comparison with the nucleotide
sequence of SEQ ID NO: 1 or a nucleotide sequence
complementary thereto.
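
The mismatch tolerances stated above (one or two nucleotides for a short oligonucleotide, 50% or less and preferably 20% or less for a long cDNA probe) can be checked mechanically. The following is a minimal sketch, assuming the probe and the reference segment are written in the same orientation and have equal length; the function names and the example sequences are hypothetical.

```python
def mismatch_count(probe: str, reference: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if not probe or len(probe) != len(reference):
        raise ValueError("probe and reference must be non-empty and of equal length")
    return sum(p != r for p, r in zip(probe.upper(), reference.upper()))


def within_tolerance(probe, reference, max_mismatches=None, max_fraction=None):
    """Apply an absolute limit (short probes) and/or a fractional limit (long probes)."""
    n = mismatch_count(probe, reference)
    if max_mismatches is not None and n > max_mismatches:
        return False
    if max_fraction is not None and n / len(probe) > max_fraction:
        return False
    return True


# Short oligonucleotide: at most two mismatches permitted.
short_ok = within_tolerance("ACGTACGTACGTACGTA", "ACGTACGTACGTACGTT", max_mismatches=2)
# Longer probe: preferred limit of 20% mismatches.
long_ok = within_tolerance("AAAAAAAAAA", "AAAAATTTTT", max_fraction=0.20)
```

Whether a probe with a given number of mismatches actually hybridizes still depends on its length and on the hybridization conditions, as the text notes.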

The probe of the present invention thus designed can
be used as a labeled probe having a label such as a
fluorescent label, a radioactive label or a biotin label,
in order to detect or confirm a hybrid formed with a target
sequence in G34.

For example, the labeled probe of the present
invention may be used for confirmation or quantification of
PCR amplification products from the G34 nucleic acid. In
this case, it is preferable to use a probe targeting the
nucleotide sequence located in a region between a pair of
primer sequences used for PCR. An example of such a probe
may be an oligonucleotide consisting of the nucleotide
sequence shown in SEQ ID NO: 16 (corresponding to a
complementary strand against nucleotides 525 to 556 in SEQ
ID NO: 1) (see Example 3).

The probe of the present invention may be included in 
a kit such as a diagnostic DNA probe kit or may be 
immobilized on a chip such as a DNA microarray chip.

(B) Primers of the present invention

In a preferred embodiment, the primers obtained from
the nucleic acid for the canceration assay of the present
invention are oligonucleotide primers. To prepare
oligonucleotide primers, two regions may be selected from
the ORF region of the nucleotide sequence shown in SEQ ID
NO: 1 or 3 in such a manner as to satisfy the following
conditions:

a) the length of each region is at least a dozen
nucleotides, particularly at least 15 nucleotides,
preferably at least 17 nucleotides, more preferably at
least 20 nucleotides, and at most 50 nucleotides; and

b) the G+C content in each region is 40% to 70%.

In practice, oligonucleotide primers may be
prepared as single-stranded DNAs having nucleotide
sequences identical or complementary to the two regions
thus selected, or may be prepared as single-stranded DNAs
modified so as not to lose the binding specificity to these
nucleotide sequences. Although each primer of the present
invention preferably has a sequence that is completely
complementary to the selected target sequence, a mismatch
of one or two nucleotides may be permitted.
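
Conditions a) and b) above lend themselves to a simple screening function. The following is a minimal sketch, assuming a lower length bound of 15 nucleotides; the function names and test sequences are hypothetical and are not taken from SEQ ID NO: 1 or 3.

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def satisfies_primer_conditions(region: str,
                                min_len: int = 15,
                                max_len: int = 50,
                                gc_min: float = 40.0,
                                gc_max: float = 70.0) -> bool:
    """Check condition a) (length) and condition b) (G+C content of 40% to 70%)."""
    if not (min_len <= len(region) <= max_len):
        return False
    return gc_min <= gc_content(region) <= gc_max


balanced = satisfies_primer_conditions("ATGCATGCATGCATGCA")   # 17 nt, about 47% G+C
at_rich = satisfies_primer_conditions("ATATATATATATATATAT")   # 18 nt, 0% G+C
too_short = satisfies_primer_conditions("GCGCGC")             # 6 nt
```

Raising `min_len` to 17 or 20 applies the more preferred length limits named in condition a).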

Examples of the pair of primers according to the
present invention include a pair of oligonucleotides
consisting of SEQ ID NOs: 14 and 15 (corresponding to
complementary strands against nucleotides 481-501 and
562-581 in SEQ ID NO: 1, respectively) for human G34, and a
pair of oligonucleotides consisting of SEQ ID NOs: 17 and
18 (corresponding to complementary strands against
nucleotides 481-501 and 562-581 in SEQ ID NO: 3,
respectively) for mouse G34.
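
The coordinates given for the human primer pair (nucleotides 481-501 and 562-581) and for the probe of SEQ ID NO: 16 (nucleotides 525-556) can be checked for consistency with the stated preference that the probe lie between the primers. The following is a minimal sketch using only those coordinates; the `Region` type and variable names are hypothetical.

```python
from typing import NamedTuple


class Region(NamedTuple):
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


forward_primer = Region(481, 501)   # region of SEQ ID NO: 14 in SEQ ID NO: 1
reverse_primer = Region(562, 581)   # region of SEQ ID NO: 15 in SEQ ID NO: 1
probe = Region(525, 556)            # region of SEQ ID NO: 16 in SEQ ID NO: 1

# The amplified region spans from the first primer start to the second primer end.
amplicon_length = reverse_primer.end - forward_primer.start + 1

# The probe should lie strictly between the two primer regions.
probe_is_internal = forward_primer.end < probe.start and probe.end < reverse_primer.start
```

With these coordinates the amplified region is 101 nucleotides long and the 32-nucleotide probe falls entirely between the two primers.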

(7) Canceration assay according to the present invention 
As described earlier, the G34 nucleic acid of the
present invention was confirmed to show a significant
increase in the expression level (i.e., transcription level
of the gene from the genome into mRNA) in a cancerous
biological sample when compared to a normal biological
sample. The G34 nucleic acid of the present invention was
demonstrated to be useful at least in a canceration assay
for large intestine (colon) cancer or lung cancer (see
Example 3).

According to detailed embodiments of the canceration
assay of the present invention, transcription products
extracted from a biological sample or a nucleic acid
library derived therefrom may be used as a test sample and
measured for the amount of the G34 nucleic acid (typically
the amount of its mRNA) using the above probe or primer to
determine whether the measured value is significantly
higher than that of a normal biological sample. In this
case, if the measured value of the test biological sample
is significantly higher than the reference value of the
normal biological sample, the test biological sample is
determined as being cancerous or having a high grade of
malignancy.

In the canceration assay of the present invention,
the reference value for a normal biological sample used as
a control may be a value measured for a control site
(typically a normal site) in the same tissue of the same
patient or may be a value normalized from known data
obtained in a control site, e.g., the mean value of mRNA
levels in normal tissues.
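
The comparison of a test value against a reference value can be sketched as follows. This is an illustration under stated assumptions, not a criterion given in the specification: the reference is taken as the mean of normal-tissue mRNA levels, and the `fold_cutoff` parameter and its default of 2.0 are hypothetical placeholders for whatever criterion of significance is adopted.

```python
from statistics import mean


def reference_value(normal_levels: list[float]) -> float:
    """Reference value taken as the mean mRNA level measured in normal tissues."""
    return mean(normal_levels)


def is_positive(test_level: float,
                normal_levels: list[float],
                fold_cutoff: float = 2.0) -> bool:
    """Flag the test sample when it exceeds the reference by the chosen factor.

    fold_cutoff is a hypothetical criterion chosen for this example only.
    """
    return test_level > fold_cutoff * reference_value(normal_levels)


normal = [1.0, 2.0, 3.0]               # arbitrary units, illustrative data only
high_sample = is_positive(10.0, normal)
borderline_sample = is_positive(3.0, normal)
```

In practice the criterion would be set from the accuracy required of the assay, and a statistical test could replace the fixed factor used here.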

According to the measurement of expression levels
using the nucleic acid for measurement of the present
invention, human G34 is found to be expressed at a high
level in the brain, skeletal muscle, pancreas, adrenal
gland, testis and prostate when measured in normal sites,
and there is also significant expression in other sites,
although at a relatively low level. This indicates that
human G34 expression is widely found over various tissues
and that the expression level of human G34 is significantly
increased even in tissues with a relatively low expression
level, such as large intestine (colon) and lung tissues.
Once these data have been provided, those skilled in the
art will recognize the actual utility and effect of the
nucleic acid for measurement of the present invention.

In this assay, whether the measured value for a test
sample is significantly higher than that of a normal sample
may be determined by the criteria that are set depending on
the accuracy (positive rate) required for the assay or the
grade of malignancy to be determined. The criteria may be
freely set depending on the intended purpose; for example,
the reference value to be determined as positive may be set
to a lower value for the purpose of detecting tissues with
a high grade of malignancy or may be set to a higher value
for the purpose of comprehensively detecting test samples
with signs or risk of canceration.

Examples will be given below of hybridization and PCR
assays to illustrate the canceration assay of the present
invention.

(A) Hybridization assay 

Embodiments of this assay include those using a probe
obtained from the nucleic acid of the present invention,
e.g., methods using various hybridization assays well known
to those skilled in the art, exemplified by Southern
blotting, Northern blotting, dot blotting or colony
hybridization. In the case of requiring amplification
and/or quantification of the detected signal, these methods
may further be combined with immunoassay.

According to typical hybridization assays, a nucleic
acid extracted from a biological sample or an amplification
product thereof may be immobilized on a solid phase and
hybridized with a labeled probe under stringent conditions.
After washing, the label attached to the solid phase may be
measured.

Extraction and purification of transcription products 
from a biological sample may be accomplished by using any 
method known to those skilled in the art.

(B) PCR assay 

In a preferred embodiment, the canceration assay of
the present invention includes PCR methods based on nucleic
acid amplification using the primers of the present
invention. The details of PCR are as explained earlier. In
this subsection, a detailed PCR-based embodiment of this
assay will be explained.

G34 mRNA in transcription products to be assayed can
be amplified by PCR using a pair of primers located at both
ends of a given region selected from the nucleotide
sequence of G34. In this step, if even trace amounts of
G34 nucleic acid fragments are present in an analyte, these
fragments will serve as templates to replicate and amplify
the nucleic acid region between the primer pair. After
repeating a given number of PCR cycles, the nucleic acid
fragments serving as templates are each amplified to a
desired concentration. Under the same amplification
conditions, the amplification product will be obtained in
proportion to the amount of G34 mRNA present in the analyte.
Then, the above probe or the like targeting the amplified
region may be used to confirm whether the amplification
product is the nucleic acid of interest and also quantify
the same. Likewise, the nucleic acid in a normal tissue
may also be measured in the same manner. In this case, a
nucleic acid of a gene that is widely and usually present
in the same tissue or the like (e.g., a nucleic acid
encoding glyceraldehyde-3-phosphate dehydrogenase (GAPDH)
or β-actin) may be used as a control to remove variations
among individuals. The measured value for the
transcription level of G34 is provided for comparison to
assay the presence of canceration or the grade of
malignancy, as described above.
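
The use of a control gene such as GAPDH or β-actin to remove variation among samples can be sketched as a ratio calculation. The following is a minimal illustration, assuming that both the G34 signal and the control signal are measured in the same arbitrary units; the function names and the numerical values are hypothetical and are not data from the Examples.

```python
def normalized_level(g34_signal: float, control_signal: float) -> float:
    """Express the G34 signal relative to a control gene measured in the same sample."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return g34_signal / control_signal


def fold_change(test_g34: float, test_control: float,
                normal_g34: float, normal_control: float) -> float:
    """Ratio of the normalized G34 level in the test sample to that in the normal sample."""
    return (normalized_level(test_g34, test_control)
            / normalized_level(normal_g34, normal_control))


# Illustrative values only (arbitrary units).
change = fold_change(test_g34=800.0, test_control=100.0,
                     normal_g34=200.0, normal_control=100.0)
```

The resulting ratio is the quantity that would then be compared against the criterion chosen for the canceration assay.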

A nucleic acid sample provided for PCR methods may be
either total mRNA extracted from a biological sample (e.g.,
a test tissue or cell) or total cDNA reverse transcribed
from mRNA. In a case where mRNA is amplified, the NASBA
method (3SR method, TMA method) using the primer pair
mentioned above may be employed. Since the NASBA method
per se is well known and kits for this method are
commercially available, the method may be readily
accomplished by using the primer pair of the present
invention.

To detect or quantify the above amplification product,
the reaction solution after amplification may be
electrophoresed and the resulting bands may be stained with
ethidium bromide or the like, or alternatively, the
electrophoresed amplification product may be immobilized
onto a solid phase (e.g., a nylon membrane), hybridized
with a labeled probe specifically hybridizing to a test
nucleic acid (e.g., a probe having the nucleotide sequence
of SEQ ID NO: 16) and washed, followed by detection of the
label.

Examples of PCR methods preferred for this assay
include quantitative PCR, especially kinetic RT-PCR or
quantitative real-time PCR. In particular, quantitative
real-time RT-PCR targeted at mRNA libraries is preferred in
that it allows direct purification of a target to be
measured from a biological sample and directly reflects the
transcription level. However, the nucleic acid
quantification in this assay is not limited to quantitative
PCR. Other known quantitative DNA assays (e.g., Northern
blotting, dot blotting, DNA microarray) using the above
probe may also be applied to the PCR products.

Moreover, when performed using a quencher fluorescent
dye and a reporter fluorescent dye, quantitative RT-PCR
also enables quantification of a target nucleic acid in an
analyte. In particular, it may be readily performed since
kits for quantitative RT-PCR are commercially available.
Moreover, a target nucleic acid may also be semi-quantified
based on the intensity of the corresponding electrophoretic
band.

(C) Assay for therapeutic effect on cancer 

Other embodiments of the canceration assay of the
present invention include an assay for determining the
effect of curing or alleviating cancer. For example,
targets of this assay include all treatments such as
administration of an anticancer agent and radiation therapy,
and targets of these treatments include in vitro cancer
cells or cancer tissues derived from cancer patients or
experimental animal models for carcinogenesis.

According to this assay, in a case where a biological
sample is subjected to a certain treatment, it is possible
to know the therapeutic effect of the treatment on cancer
by determining whether the transcription level of the G34
nucleic acid in the biological sample is reduced due to the
treatment. This assay is not limited to a determination
whether the transcription level is reduced, and the result
may also be evaluated as effective when an increase in the
transcription level is significantly prevented. The
transcription level may not only be compared with that of
an untreated tissue, but also traced over time after the
treatment.

The assay of the present invention for therapeutic
effect on cancer includes, for example, a determination
whether a candidate substance for an anticancer agent is
effective for cancerous tissues, whether resistance is
developed to an anticancer agent in cancer patients
receiving the agent, or whether a candidate substance for
an anticancer agent is effective for diseased tissues or
the like in experimental animal models. Test tissues from
experimental animal models are not limited to in vitro
samples, and also include in vivo or ex vivo samples.

(8) Creation of genetically engineered animal 

As described earlier, the inventors of the present
invention have identified the presence of mouse G34 and its
nucleic acid sequence (SEQ ID NO: 3). The present
invention also relates to a means for expression and
functional analysis of G34 at the animal level on the basis
of various gene conversion techniques using fertilized eggs
or ES cells, and typically relates to creating transgenic
animals into which the G34 gene is introduced and knockout
mice which are deficient in mouse G34, etc.

For example, the creation of knockout mice may be
accomplished in accordance with routine techniques in the
art (see, e.g., Newest Technique for Gene Targeting, edited
by Takeshi Yagi, Yodosha Co., Ltd., Japan; Gene targeting,
translated and edited by Tetsuo Noda, Medical Science
International, Ltd., Japan). Namely, those skilled in the
art will be able to obtain G34 homologous recombinant ES
cells in accordance with known gene targeting techniques
using sequence information of the mouse G34 nucleic acid
disclosed herein, thus creating G34 knockout mice using
these cells (see Example 7).

Recently, a method has been developed to prevent gene 
expression by small interfering RNA (T.R. Brummelkamp et 
al., Science, 296, 550-553 (2002)); it is also possible to 
create G34 knockout mice in accordance with such a known
method. 

The provision of G34 knockout mice will be helpful in 
elucidating the involvement of the G34 gene in certain 
vital phenomena, i.e., information on redundancy of the 
gene, the relationship between deficiency of the gene and 
phenotype at the animal level (including any type of 
abnormality affecting motor, mental and sensory functions), 
as well as functions of the gene during the animal life 
cycle including development, growth and ageing. More 
specifically, the knockout mice thus obtained may be used 
to detect a carrier of sugar chains synthesized by G34 and 
mG34 and to examine their relationship with physiological 
functions or diseases, etc. For example, glycoproteins and
glycolipids may be extracted from each tissue derived from
the knockout mice and compared with those of wild-type mice
by techniques such as proteomics (e.g., two-dimensional
electrophoresis, two-dimensional thin-layer chromatography,
mass spectrometry) to identify a carrier of the synthesized
sugar chains. Moreover, the relationship with
physiological functions or diseases may be estimated by
comparing phenotypes (e.g., fetal formation, growth process,
spontaneous behavior) between knockout mice and wild-type
mice.

Definitions of terms 

As used herein to describe the transcription level of
a nucleic acid, the term "measured value" or "expression
level" refers to the amount of the nucleic acid present in
transcription products from a fixed amount of a biological
sample, i.e., the concentration of the nucleic acid.
Moreover, since the assay of the present invention relies
on the comparison of such measured values, even when a
nucleic acid is amplified, e.g., by PCR for the purpose of
quantification or even when signals from a probe label are
amplified, these amplified values may also be provided for
relative comparison. Thus, the "measured value for a
nucleic acid" can also be understood as the amount of the
nucleic acid after amplification or the signal level after
amplification.

As used herein, the term "target nucleic acid" or
"the nucleic acid" encompasses all types of nucleic acids,
regardless of in vivo or in vitro, including of course G34
mRNA, as well as those obtained using the mRNA as a
template. It should be noted that the term "nucleotide
sequence" used herein also includes a complementary
sequence thereof, unless otherwise specified.

As used herein, the term "biological sample" refers
to an organ, tissue or cell, as well as an experimental
animal-derived organ, tissue, cell or the like, and
preferably refers to a tissue or cell. Examples of such a
tissue include the brain, fetal brain, cerebellum, medulla
oblongata, submandibular gland, thyroid gland, trachea,
lung, heart, skeletal muscle, esophagus, duodenum, small
intestine, large intestine (colon), rectum, colon, liver,
fetal liver, pancreas, kidney, adrenal gland, thymus, bone
marrow, spleen, testis, prostate, mammary gland, uterus and
placenta, with the large intestine (colon) and lung being
more preferred.

As used herein, the term "measure", "measurement" or
"assay" encompasses all of detection, amplification,
quantification and semi-quantification. In particular, the
assay according to the present invention relates to a
canceration assay for a biological sample, as described
above, and hence can be applied to, e.g., cancer diagnosis
and treatment in the medical field. The term "canceration
assay" used herein includes an assay as to whether a
biological sample has become cancerous, as well as an assay
as to whether the grade of malignancy is high. The term
"cancer" used herein typically encompasses malignant tumors
in general and also includes disease conditions caused by
the malignant tumors. Thus, targets of the assay according
to the present invention include, but are not necessarily
limited to, neuroblastoma, glioma, lung cancer, esophageal
cancer, gastric cancer, pancreatic cancer, liver cancer,
kidney cancer, duodenal cancer, small intestine cancer,
large intestine (colon) cancer, rectal cancer, colon cancer
and leukemia, with large intestine (colon) cancer and lung
cancer being preferred.

The present invention will now be illustrated in more
detail by way of the following examples.
[EXAMPLES] 

Example 1: Cloning and expression of human G34 gene, as 
well as purification of the expressed protein 

β3 galactosyltransferase 6 (β3GalT6) was used as a
query for a BLAST search to thereby find a nucleic acid
sequence with homology (SEQ ID NO: 1). The open reading
frame (ORF) estimated from the nucleic acid sequence is
composed of 1503 bp, i.e., 500 amino acids (SEQ ID NO: 2)
when calculated as an amino acid sequence. The product
encoded by these nucleic acid and amino acid sequences was
designated human G34.
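
The relationship between the ORF length and the protein length can be verified with simple arithmetic. The following is a minimal sketch, assuming (the text does not say so explicitly) that the 1503 bp figure includes the three nucleotides of the stop codon; the variable names are hypothetical.

```python
orf_length_bp = 1503

# An ORF must contain a whole number of codons.
assert orf_length_bp % 3 == 0

codons = orf_length_bp // 3   # 501 codons in total
amino_acids = codons - 1      # the stop codon does not encode an amino acid
```

Under that assumption, 501 codons minus one stop codon gives the 500 amino acids stated for SEQ ID NO: 2.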

The amino acid sequence of G34 has a hydrophobic
amino acid region characteristic of glycosyltransferases at
its N-terminal end and shares a homology of 47% (nucleic
acid sequence) and 28% (amino acid sequence) with the above
β3GalT6. The amino acid sequence of G34 also retains all
of the three motifs conserved in the β3GalT family.

In this example, the expression of G34 was not only
confirmed in mammalian cells; the protein was also
expressed in insect cells for further examination of its

For activity confirmation, it would be sufficient to
express at least an active region covering amino acid 189
to the C-terminal end of SEQ ID NO: 1, which is relatively
homologous to β3GalT6. In this example, however, expression
of an active region covering amino acid 36 to the
C-terminal end was attempted.

Confirmation of human G34 gene expression in mammalian 
cells 

The active region covering amino acid 36 to the
C-terminal end of G34 was genetically introduced into a
mammalian cell line expression vector pFLAG-CMV3 using a
FLAG Protein Expression system (Sigma-Aldrich Corporation).
Since pFLAG-CMV3 has a multicloning site, a gene of
interest can be introduced into pFLAG-CMV3 when the gene
and pFLAG-CMV3 are treated with restriction enzymes and
then subjected to ligation reaction.

Kidney-derived cDNA (Clontech, Marathon-ready cDNA)
was used as a template and subjected to PCR using a
5'-primer (G34-CMV-F1; SEQ ID NO: 5) and a 3'-primer
(G34-CMV-R1; SEQ ID NO: 6) to obtain a DNA fragment of
interest. PCR was performed under conditions of 25 cycles
of 98°C for 10 seconds, 55°C for 30 seconds, and 72°C for
2 minutes. The PCR product was then electrophoresed on an
agarose gel and isolated in a standard manner after gel
excision. This PCR product has restriction enzyme sites
HindIII and BamHI at the 5' and 3' sides, respectively.

After this DNA fragment and pFLAG-CMV3 were each
treated with restriction enzymes HindIII and BamHI, the
reaction solutions were mixed together and subjected to
ligation reaction, so that the DNA fragment was introduced
into pFLAG-CMV3. The reaction solution was purified by
ethanol precipitation and then mixed with competent cells
(E. coli DH5α). After heat shock treatment (42°C, 30
seconds), the cells were seeded on ampicillin-containing LB
agar medium.

On the next day, the resulting colonies were 
confirmed by direct PCR for the DNA of interest. For more 
reliable results, after sequencing to confirm the DNA 
sequence, the vector (pFLAG-CMV3-G34A) was extracted and 
purified. 

Human kidney cell-derived cell line 293T cells (2 ×
10^6) were suspended in 10 ml antibiotic-free DMEM medium
(Invitrogen Corporation) supplemented with 10% fetal bovine
serum, seeded in a 10 cm dish and cultured for 16 hours at
37°C in a CO2 incubator. pFLAG-CMV3-G34A (20 ng) and
Lipofectamine 2000 (30 µl, Invitrogen Corporation) were each
mixed with 1.5 ml OPTI-MEM (Invitrogen Corporation) and
incubated at room temperature for 5 minutes. These two
solutions were further mixed gently and incubated at room
temperature for 20 minutes. This mixed solution was added
dropwise to the dish and cultured for 48 hours at 37°C in a
CO2 incubator.

The supernatant (10 ml) was mixed with NaN3 (0.05%),
NaCl (150 mM), CaCl2 (2 mM) and anti-FLAG M1 resin (100 µl,
SIGMA), followed by overnight stirring at 4°C. On the next
day, the supernatant was centrifuged (3000 rpm, 5 minutes,
4°C) to collect a pellet fraction. After addition of 2 mM
CaCl2-TBS (900 µl), centrifugation was repeated (2000 rpm,
5 minutes, 4°C) and the resulting pellet was suspended in
200 µl of 1 mM CaCl2-TBS for use as a sample for activity
measurement (G34 enzyme solution). A part of this sample
was electrophoresed by SDS-PAGE and Western blotted using
anti-FLAG M2-peroxidase (SIGMA) to confirm the expression
of the G34 protein of interest.

As a result, a band was detected at a position of
about 60 kDa, thus confirming the expression of the G34
protein.

Insertion of human G34 gene into insect cell expression
vector

The active region covering amino acid 36 to the
C-terminal end of G34 was integrated into pFastBac
(Invitrogen Corporation) in a GATEWAY system (Invitrogen
Corporation). Moreover, a Bac-to-Bac system (Invitrogen
Corporation) was also used to construct a bacmid.

(1) Creation of entry clone 

Kidney-derived cDNA (Clontech, Marathon-ready cDNA)
was used as a template and subjected to PCR using a
5'-primer (G34-GW-F1; SEQ ID NO: 7) and a 3'-primer
(G34-GW-R1; SEQ ID NO: 8) to obtain a DNA fragment of
interest. PCR was performed under conditions of 25 cycles
of 98°C for 10 seconds, 55°C for 30 seconds, and 72°C for
2 minutes. The PCR product was then electrophoresed on an
agarose gel and isolated in a standard manner after gel
excision.

This product was integrated into pDONR201 (Invitrogen
Corporation) through BP clonase reaction to create an
"entry clone." The reaction was accomplished by incubating
the DNA fragment of interest (5 µl), pDONR201 (1 µl, 150
ng), reaction buffer (2 µl) and BP clonase mix (2 µl) at
25°C for 1 hour. The reaction was stopped by addition of
proteinase K (1 µl) and incubation at 37°C for 10 minutes.
The above reaction solution (1 µl) was then mixed with
100 µl competent cells (E. coli DH5α, TOYOBO). After heat
shock treatment, the cells were seeded in a
kanamycin-containing LB plate.

On the next day, colonies were collected and 
confirmed by direct PCR for the DNA of interest. For more 
reliable results, after sequencing to confirm the DNA 
sequence, the vector (pDONR-G34A) was extracted and
purified. 

(2) Creation of expression clone 

At both sides of the insertion site, the above entry
clone has attL recombination sites for excision of lambda
phage from E. coli. When the entry clone is mixed with LR
clonase (a mixture of lambda phage recombination enzymes
Int, IHF and Xis) and a destination vector, the insertion
site is transferred to the destination vector to give an
expression clone. Detailed steps are as shown below.

First, the entry clone (1 µl), pFBIF (0.5 µl, 75 ng),
LR reaction buffer (2 µl), TE (4.5 µl) and LR clonase mix
(2 µl) were reacted at 25°C for 1 hour. The reaction was
stopped by addition of proteinase K (1 µl) and incubation
at 37°C for 10 minutes (this recombination reaction results
in pFBIF-G34A). pFBIF is a pFastBac1 vector modified to
have an Igκ signal sequence (SEQ ID NO: 9) and a FLAG
peptide for purification (SEQ ID NO: 10). The Igκ signal
sequence is inserted for the purpose of converting the
expressed protein into a secretion form, while the FLAG
peptide is inserted for the purpose of purification. To
insert the FLAG peptide, a DNA fragment obtained from OT3
(SEQ ID NO: 11) as a template using primers OT20 (SEQ ID
NO: 12) and OT21 (SEQ ID NO: 13) was inserted with BamHI
and EcoRI. Further, to insert a Gateway sequence, a
Gateway Vector Conversion system (Invitrogen Corporation)
was used to introduce a Conversion cassette.

Subsequently, the whole volume of the above mixed
solution (11 µl) was mixed with 100 µl competent cells
(E. coli DH5α). After heat shock treatment, the cells were
seeded on an ampicillin-containing LB plate. On the next
day, colonies were collected and confirmed by direct PCR
for the DNA of interest, and the vector (pFBIF-G34A) was
extracted and purified.
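
As a quick consistency check, the component volumes quoted for the LR reaction can be summed to confirm the 11 µl carried into the transformation. The sketch below is illustrative only: the dictionary keys and the helper `total_volume` are hypothetical names, and every component is assumed to be measured in µl, since some units are not fully legible in the source.

```python
# Component volumes (assumed to be in µl) of the LR recombination reaction.
lr_reaction_ul = {
    "entry clone": 1.0,
    "pFBIF destination vector": 0.5,
    "LR reaction buffer": 2.0,
    "TE": 4.5,
    "LR clonase mix": 2.0,
}
proteinase_k_ul = 1.0  # added afterwards to stop the reaction


def total_volume(components, stop_reagent=0.0):
    """Return the summed volume of all components plus the stop reagent."""
    return sum(components.values()) + stop_reagent


reaction_volume = total_volume(lr_reaction_ul)
final_volume = total_volume(lr_reaction_ul, proteinase_k_ul)
print(f"LR reaction: {reaction_volume:g} µl; after proteinase K: {final_volume:g} µl")
```

Under these assumptions the reaction itself totals 10 µl and the stopped mixture 11 µl, which agrees with the volume stated for the transformation step.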

(3) Construction of bacmid by Bac-to-Bac system

Next, a Bac-to-Bac system (Invitrogen Corporation)
was used to cause recombination between the above
pFBIF-G34A and pFastBac, so that G34 and other sequences
were inserted into a bacmid capable of growing in insect
cells.

This system utilizes a Tn7 recombination site and
allows a gene of interest to be incorporated into a bacmid
through a recombinant protein produced from a helper
plasmid when pFastBac carrying the inserted gene of
interest is merely introduced into bacmid-containing
E. coli (DH10BAC, Invitrogen Corporation). In addition,
such a bacmid contains the lacZ gene and allows selection
based on the classical blue (not inserted)/white (inserted)
colony screening.

Namely, the vector purified above (pFBIF-G34A) was
mixed with 50 µl competent cells (E. coli DH10BAC). After
heat shock treatment, the cells were seeded on an LB plate
containing kanamycin, gentamicin, tetracycline, Bluo-gal
and IPTG. On the next day, white single colonies were
further cultured to collect the bacmid.

Introduction of human G34 gene-containing bacmid into
insect cells

After confirming that the sequence of interest was 
inserted into the bacmid obtained from the above white 
colonies, this bacmid was introduced into insect cells 
(Sf21, commercially available from Invitrogen Corporation). 

Namely, Sf21 cells were added to a 35 mm dish at 9 x
10^5 cells/2 ml antibiotic-containing Sf-900SFM (Invitrogen
Corporation) and cultured at 27°C for 1 hour to allow cell
adhesion. Solution A was prepared by diluting purified
bacmid DNA (5 µl) with 100 µl antibiotic-free Sf-900SFM.
Solution B was prepared by diluting CellFECTIN Reagent
(6 µl, Invitrogen Corporation) with 100 µl antibiotic-free
Sf-900SFM. Solutions A and B were then mixed carefully and
incubated for 45 minutes at room temperature. After
confirming cell adhesion, the culture solution was
aspirated and replaced by antibiotic-free Sf-900SFM (2 ml).
The solution prepared by mixing Solutions A and B
(lipid-DNA complexes) was diluted and mixed carefully with
antibiotic-free Sf900II (800 µl). The culture solution was
aspirated from the cells and replaced by the diluted
solution of lipid-DNA complexes, followed by incubation at
27°C for 5 hours. The transfection mixture was then removed
and replaced by antibiotic-containing Sf-900SFM culture
solution (2 ml), followed by incubation at 27°C for 72
hours. At 72 hours after transfection, the cells were
released by pipetting and collected together with the
culture solution, followed by centrifugation at 3000 rpm
for 10 minutes. The resulting supernatant was stored in
another tube (which was used as a first virus solution).

Sf21 cells were introduced into a T75 culture flask
at 1 x 10^7 cells/20 ml Sf-900SFM (antibiotic-containing)
and incubated at 27°C for 1 hour. After the cells were
adhered, the first virus solution (800 µl) was added and
cultured at 27°C for 48 hours. After 48 hours, the cells
were released by pipetting and collected together with the
culture solution, followed by centrifugation at 3000 rpm
for 10 minutes. The resulting supernatant was stored in
another tube (which was used as a second virus solution).

Moreover, Sf21 cells were introduced into a T75
culture flask at 1 x 10^7 cells/20 ml Sf-900SFM (antibiotic-
containing) and incubated at 27°C for 1 hour. After the
cells were adhered, the second virus solution (100 µl) was
added and cultured at 27°C for 72 hours. After culturing,
the cells were released by pipetting and collected together
with the culture solution, followed by centrifugation at
3000 rpm for 10 minutes. The resulting supernatant was
stored in another tube (which was used as a third virus
solution). In addition, Sf21 cells were introduced into a
100 ml spinner flask at a concentration of 6 x 10^5 cells/ml
in a volume of 100 ml. The third virus solution (1 ml) was
added and cultured at 27°C for about 96 hours. After
culturing, the cells and the culture solution were
collected and centrifuged at 3000 rpm for 10 minutes. The
resulting supernatant was stored in another tube (which was
used as a fourth virus solution).

Resin purification of G34

The pFLAG-G34 supernatant of the above fourth virus
solution (10 ml) was mixed with NaN3 (0.05%), NaCl (150
mM), CaCl2 (2 mM) and anti-FLAG-M1 resin (100 µl, SIGMA),
followed by overnight stirring at 4°C. On the next day, the
mixture was centrifuged (3000 rpm, 5 minutes, 4°C) to
collect a pellet fraction. After addition of 2 mM CaCl2-TBS
(900 µl), centrifugation was repeated (2000 rpm, 5 minutes,
4°C) and the resulting pellet was suspended in 200 µl of
1 mM CaCl2-TBS for use as a sample for activity measurement
(G34 enzyme solution). A part of this sample was
electrophoresed by SDS-PAGE and Western blotted using anti-
FLAG M2-peroxidase (SIGMA) to confirm the expression of the
G34 protein of interest. As a result, a plurality of bands
were detected broadly around a position of about 60 kDa
(which would be due to differences in post-translational
modifications such as glycosylation), thus confirming the
expression of the G34 protein.

Example 2: Search for glycosyltransferase activity of human
G34 protein

(1) Screening of GalNAc transferase activity

The G34 protein was examined for its substrate
specificity, optimum buffer, optimum pH and divalent ion
requirement in its β1,3-N-acetylgalactosaminyltransferase
activity.

The following reaction system was used for examining
the G34 enzyme protein for its acceptor substrate
specificity in its GalNAc transfer activity.

In the reaction solutions shown below, each of the
following was used at 10 nmol as an acceptor substrate:
pNp-α-Gal, oNp-β-Gal, Bz-α-GlcNAc, pNp-β-GlcNAc, Bz-α-
GalNAc, pNp-β-GalNAc, pNp-α-Glc, pNp-β-Glc, pNp-β-GlcA,
pNp-α-Fuc, pNp-α-Xyl, pNp-β-Xyl and pNp-α-Man (all
purchased from SIGMA), wherein "Gal" represents a
D-galactose residue, "Xyl" represents a D-xylose residue,
"Fuc" represents a D-fucose residue, "Man" represents a
D-mannose residue and "GlcA" represents a glucuronic acid
residue.

Each reaction solution was prepared as follows (final
concentrations in parentheses): each substrate (10 nmol),
MES (2-morpholinoethanesulfonic acid) (pH 6.5, 50 mM),
MnCl2 (10 mM), Triton X-100 (trade name) (0.1%), UDP-
GalNAc (2 mM) and UDP-[14C]GlcNAc (40 nCi) were mixed and
supplemented with 5 µl G34 enzyme solution, followed by
dilution with H2O to a total volume of 20 µl (see Table 1).



Table 1

Composition of reaction solutions (µl)

                        E(+),D(+)    x8     E(-),D(+)   E(+),D(-)
Enzyme solution            5         40        0           5
140 mM HEPES pH 7.4        2         16        2           2
100 mM UDP-GalNAc          0.5        4        0.5         0
200 mM MnCl2               1          8        1           1
10% Triton CF-54           0.6        4.8      0.6         0.6
H2O                        5.9       47.2     10.9         6.4
10 nmol/µl Acceptor        5         40        5           5
Total                     20                  20          20
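
The stock volumes in Table 1 imply final concentrations through the usual dilution relation C_final = C_stock x V_stock / V_total. The following sketch, a minimal illustration rather than part of the described method, applies that relation to the E(+),D(+) column; the variable and function names are hypothetical.

```python
FINAL_VOLUME_UL = 20.0  # total reaction volume given in Table 1

# name -> (stock concentration, unit, volume in µl), E(+),D(+) column of Table 1
stocks = {
    "HEPES pH 7.4": (140.0, "mM", 2.0),
    "UDP-GalNAc": (100.0, "mM", 0.5),
    "MnCl2": (200.0, "mM", 1.0),
    "Triton CF-54": (10.0, "%", 0.6),
}
# Components for which Table 1 gives only a volume.
other_volumes_ul = {"enzyme solution": 5.0, "H2O": 5.9, "acceptor": 5.0}


def final_concentration(stock, volume_ul, total_ul=FINAL_VOLUME_UL):
    """Dilution relation: C_final = C_stock * V_stock / V_total."""
    return stock * volume_ul / total_ul


final = {name: final_concentration(c, v) for name, (c, _unit, v) in stocks.items()}
total_ul = sum(v for _c, _u, v in stocks.values()) + sum(other_volumes_ul.values())

for name, (_c, unit, _v) in stocks.items():
    print(f"{name}: {final[name]:g} {unit}")
print(f"total volume: {total_ul:g} µl")
```

With these inputs the HEPES, MnCl2 and Triton CF-54 entries work out to 14 mM, 10 mM and 0.3%, and 0.5 µl of 100 mM UDP-GalNAc corresponds to 2.5 mM in 20 µl.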



The above reaction mixtures were each reacted at 37°C
for 16 hours. After completion of the reaction, 200 µl H2O
was added and each mixture was lightly centrifuged to
obtain the supernatant. The supernatant was passed through
a Sep-Pak plus C18 Cartridge (Waters), which had been
washed once with 1 ml methanol and twice with 1 ml H2O and
then equilibrated, to allow the substrate and product in
the supernatant to adsorb to the cartridge. After washing
the cartridge twice with 1 ml H2O, the adsorbed substrate
and product were eluted with 1 ml methanol. The eluate was
mixed with 5 ml liquid scintillator ACSII (Amersham
Biosciences) and measured for the amount of radiation with
a scintillation counter (Beckman Coulter).

As a result, the G34 protein was identified to be
GalNAc transferase having the ability to transfer GalNAc to
pNp-β-GlcNAc. The enzymatic activity was linearly
increased at least over the course of the reaction time
between 0 and 16 hours when UDP-GlcNAc was used as a donor
substrate and Bz-β-GlcNAc was used as an acceptor substrate
(see Table 2 and Figure 1).

Table 2

Reaction time     Area (%)
1 hour             0
2 hours            2.388
4 hours            6.195
16 hours          13.719

Determination of linking mode

NMR was performed to analyze the linking mode of the
sugar chain structure synthesized by the G34 enzyme protein.
First, the reaction solution (final concentrations in
parentheses) was prepared by adding Bz-β-GlcNAc (640 nmol)
as an acceptor substrate, HEPES buffer (pH 7.4, 14 mM),
Triton CF-54 (trade name) (0.3%), UDP-GalNAc (2 mM), MnCl2
(10 mM) and 500 µl G34 enzyme solution, followed by
dilution with H2O to a total volume of 2 ml. This reaction
solution was reacted at 37°C for 16 hours. The reaction
solution was heated for 5 minutes at 95°C to stop the
reaction and then purified by filtration through an
Ultrafree-MC (Millipore Corporation).

In one development, 50 µl of the filtrate was
analyzed by high performance liquid chromatography (HPLC)
using a reversed-phase column ODS-80Ts QA (4.6 x 250 mm,
Tosoh Corporation, Japan). The developing solvent used was
an aqueous 9% acetonitrile-0.1% trifluoroacetic acid
solution. The elution conditions were set to 1 ml/minute
at 40°C. Absorbance at 210 nm was used as an index for
elution peak detection using an SPD-10Avp (Shimadzu
Corporation, Japan). As a result, a new elution peak was
observed, which was not detected in the control. This peak
was separated and lyophilized for use as an NMR sample.

NMR was performed using a DMX750 (Bruker Daltonics).
As a result, the sample was determined as having a β1-3
linkage between GalNAc and GlcNAc-β1-O-Bz (see Figures 2A
and 2B). The reasons for this determination are as follows
(see Figures 2A and 2B, along with Figures 3 and 4): a) two
residues (referred to as A and B) both have a spin
coupling constant of 8.4 Hz for the signal at position 1,
suggesting that two pyranoses are in β-form; b) the spin
coupling constants given in Figure 3 indicate that A shows
a spin coupling constant characteristic of glucose, while B
shows a spin coupling constant characteristic of galactose;
c) it is A that is linked to the benzyl because NOE was
observed between methylene proton of the benzyl and A1
proton; d) there are two signals resulting from the methyl
of N-acetyl and hence both residues are identified as
N-acetylated sugars; and e) NOESY indicates the presence of
NOE in B1-A3.

On the other hand, examination was also performed on
motif sequences involved in the above enzymatic activity.

Figure 5 shows the putative amino acid sequence of
the G34 protein (SEQ ID NO: 2) compared with the amino acid
sequences of various human β1-3Gal transferases (β3Gal-T1
to -T6). In Figure 5, the boxed regions indicate the
motifs common to Gal transferases. Among them, three
motifs indicated with M1 to M3 are common to β1,3-linking
glycosyltransferases. In this figure, the amino acid
residues indicated with * are conserved among the compared
sequences.

Figure 6 shows a comparison of three motifs involved
in the ability to form β1,3 linkages (corresponding to the
M1 to M3 motifs in Figure 5) among various β1-3GlcNAc
transferases (β3Gn-T2 to -T5) and human Gal transferases T1
to T3, T5 and T6. In this figure, the amino acid residues
indicated with * are conserved among the compared sequences.

As shown in Figures 5 and 6, it was indicated that
the amino acid sequence of the G34 protein was conserved
enough to have all the motifs (M1 to M3) involved in β1,3
linkages, upon comparison with the amino acid sequences of
various known β1,3-linking glycosyltransferases.

Thus, this motif examination also supported the
conclusion that the G34 protein has the ability to transfer
GalNAc to GlcNAc with β1,3 glycosidic linkage.

Optimum buffer and optimum pH

The following reaction system was used for examining
the optimum buffer and pH for the GalNAc transferase
activity of G34. The acceptor substrate used was pNp-β-
GlcNAc.

Any one of the following buffers was used
(final concentrations in parentheses): MES (2-
morpholinoethanesulfonic acid) buffer (pH 5.5, 5.78, 6.0,
6.5 and 6.75, 50 mM), sodium cacodylate buffer (pH 5.0, 5.6,
6.0, 6.2, 6.6, 6.8, 7.0, 7.2, 7.4 and 7.5, 25 mM) and N-[2-
hydroxyethyl]piperazine-N'-[2-ethanesulfonic acid] (HEPES)
buffer (pH 6.75, 7.00, 7.30, 7.40 and 7.50, 14 mM). The
substrate (10 nmol), MnCl2 (10 mM), Triton CF-54 (trade
name) (0.3%), UDP-GalNAc (2 mM) and UDP-[14C]GlcNAc (40 nCi)
were mixed and supplemented with 5 µl G34 enzyme solution,
followed by dilution with H2O to a total volume of 20 µl.

The above reaction mixtures were each reacted at 37°C
for 16 hours. After completion of the reaction, 200 µl H2O
was added and each mixture was lightly centrifuged to
obtain the supernatant. The supernatant was passed through
a Sep-Pak plus C18 Cartridge (Waters), which had been
washed once with 1 ml methanol and twice with 1 ml H2O and
then equilibrated, to allow the substrate and product in
the supernatant to adsorb to the cartridge. After washing
the cartridge twice with 1 ml H2O, the adsorbed substrate
and product were eluted with 1 ml methanol. The eluate was
mixed with 5 ml liquid scintillator ACSII (Amersham
Biosciences) and measured for the amount of radiation with
a scintillation counter (Beckman Coulter).

As indicated by the results (see Table 3 and Figure
7), in MES buffer, G34 showed the same strong activity
around pH 5.50 and pH 5.78 within the examined range and
its activity decreased in a pH-dependent manner until pH
6.5, but became strong again at pH 6.75. In sodium
cacodylate buffer, the activity was highest at pH 5.0
within the examined range and the activity decreased in a
pH-dependent manner until pH 6.2, increased in a pH-
dependent manner until pH 7.0, and then plateaued until pH
7.4. In HEPES buffer, the activity increased in a pH-
dependent manner and reached the highest value at pH 7.4 to
7.5 within the examined range. Among them, HEPES buffer at
pH 7.4 to 7.5 resulted in the strongest activity.

Table 3

Buffer               pH       +            -            (+) - (-)
Sodium cacodylate    5.0      [illegible]  [illegible]  [illegible]
                     5.6      [illegible]  [illegible]  3194
                     6.0      2689         260          2429
                     6.2       907         138           769
                     6.6      1093         136           957
                     6.8      2488         258          2230
                     7.0      4965         259          4706
                     7.2      4377         309          4068
                     7.4      4930         304          4626
MES                  5.50     3735         197          3538
                     5.78     3755         184          3571
                     6.00     2514         141          2373
                     6.50     1981         734          1247
                     6.75     3289         136          3153
HEPES                6.75     4894         149          4745
                     7.00     4912         121          4791
                     7.30     4294         127          4167
                     7.40     6630         120          6510
                     7.50     6895         240          6655
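
The comparison across buffers can be sketched numerically. The snippet below is an illustration under stated assumptions: it treats the second and third numeric columns of Table 3 as counts with and without a reaction component (the column headings are only partly legible), takes their difference as the net signal, and omits the sodium cacodylate rows at pH 5.0 and 5.6 because their values cannot be read from the source. The names `table3`, `net_counts` and `best_ph` are hypothetical.

```python
# (pH, "+" counts, "-" counts) for the legible rows of Table 3.
table3 = {
    "sodium cacodylate": [
        (6.0, 2689, 260), (6.2, 907, 138), (6.6, 1093, 136), (6.8, 2488, 258),
        (7.0, 4965, 259), (7.2, 4377, 309), (7.4, 4930, 304),
    ],
    "MES": [
        (5.50, 3735, 197), (5.78, 3755, 184), (6.00, 2514, 141),
        (6.50, 1981, 734), (6.75, 3289, 136),
    ],
    "HEPES": [
        (6.75, 4894, 149), (7.00, 4912, 121), (7.30, 4294, 127),
        (7.40, 6630, 120), (7.50, 6895, 240),
    ],
}


def net_counts(rows):
    """Subtract the "-" column from the "+" column for each pH."""
    return [(ph, plus - minus) for ph, plus, minus in rows]


def best_ph(rows):
    """Return the (pH, net counts) pair with the largest net signal."""
    return max(net_counts(rows), key=lambda pair: pair[1])


best_by_buffer = {buffer: best_ph(rows) for buffer, rows in table3.items()}
overall_buffer = max(best_by_buffer, key=lambda b: best_by_buffer[b][1])

for buffer, (ph, net) in best_by_buffer.items():
    print(f"{buffer}: highest net signal {net} at pH {ph}")
print(f"overall: {overall_buffer}")
```

Among the legible rows, the largest net values fall in HEPES buffer at pH 7.4 to 7.5, in line with the conclusion stated in the text.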



The following reaction system was used for examining
the divalent ion requirement. The acceptor substrate used
was Bz-β-GlcNAc.

The reaction solution (final concentrations in
parentheses) was prepared by adding the substrate (10 nmol),
HEPES buffer (pH 7.4, 14 mM), Triton CF-54 (trade name)
(0.3%), UDP-GalNAc (2 mM), UDP-[14C]GlcNAc (40 nCi) and
5 µl G34 enzyme solution and further adding MnCl2, MgCl2 or
CoCl2 at 2.5 mM, 5 mM, 10 mM, 20 mM or 40 mM, followed by
dilution with H2O to a total volume of 20 µl.

The above reaction mixture was reacted at 37°C for
16 hours. After completion of the reaction, 200 µl H2O was
added and the mixture was lightly centrifuged to obtain the
supernatant. The supernatant was passed through a Sep-Pak
plus C18 Cartridge (Waters), which had been washed once
with 1 ml methanol and twice with 1 ml H2O and then
equilibrated, to allow the substrate and product in the
supernatant to adsorb to the cartridge. After washing the
cartridge twice with 1 ml H2O, the adsorbed substrate and
product were eluted with 1 ml methanol. The eluate was
mixed with 5 ml liquid scintillator ACSII (Amersham
Biosciences) and measured for the amount of radiation with
a scintillation counter (Beckman Coulter).

The results (see Table 4 and Figure 8) indicated that
the activity was enhanced by the addition of each divalent
ion and confirmed that the G34 protein was an enzyme
requiring divalent ions. Its activity nearly plateaued at
5 mM or higher concentration of Mn or Co and at 10 mM or
higher concentration of Mg. Moreover, the Mn-induced
enhancement of the activity was completely eliminated by
addition of Cu.



Table 4

RI assay (divalent ion requirement)

Metal ion    Concentration (mM)    DPM
Mn            2.5                   7260.09
              5                     8270.23
             10                     7748.77
             20                     7515.86
             40                     4870.48
             40                      371.53
Co            2.5                  10979.99
              5                     9503.91
             10                    10979.99
             20                     8070.47
             40                     7854.92
Mg            2.5                   4800.03
              5                     8692.15
             10                     8980.56
             20                     6726.32
             40                     5592.88
none                                2427.39
EDTA         20                      149.32
Mn+Cu        10 + 10                 239
none                                 155.64
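
To make the concentration dependence in Table 4 easier to read, each series can be expressed relative to its own maximum. The snippet below is only a sketch: it uses the five concentration points listed for Mn, Co and Mg, leaves out the control rows, and the names `table4`, `peak` and `relative_to_peak` are hypothetical.

```python
# DPM by metal ion and concentration (mM), transcribed from Table 4.
table4 = {
    "Mn": {2.5: 7260.09, 5: 8270.23, 10: 7748.77, 20: 7515.86, 40: 4870.48},
    "Co": {2.5: 10979.99, 5: 9503.91, 10: 10979.99, 20: 8070.47, 40: 7854.92},
    "Mg": {2.5: 4800.03, 5: 8692.15, 10: 8980.56, 20: 6726.32, 40: 5592.88},
}


def peak(series):
    """Return (concentration, DPM) of the highest reading in a series."""
    return max(series.items(), key=lambda item: item[1])


def relative_to_peak(series):
    """Express every reading as a fraction of the series maximum."""
    _conc, top = peak(series)
    return {conc: dpm / top for conc, dpm in series.items()}


for ion, series in table4.items():
    conc, dpm = peak(series)
    fractions = ", ".join(
        f"{c} mM: {frac:.2f}" for c, frac in relative_to_peak(series).items()
    )
    print(f"{ion}: peak {dpm} DPM at {conc} mM ({fractions})")
```

Because the Co series lists the same DPM value at 2.5 mM and 10 mM, `peak` reports the first of the two; this is an artifact of the tie, not a claim about the enzyme.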



Substrate specificity to oligosaccharides

The following reaction system was used for examining
the acceptor substrate specificity to oligosaccharides.
The acceptor substrates used were pNp-α-Gal, oNp-β-Gal, Bz-
α-GlcNAc, Bz-β-GlcNAc, Bz-α-GalNAc, pNp-β-GalNAc, pNp-α-Glc,
pNp-β-Glc, pNp-β-GlcA, pNp-α-Fuc, pNp-α-Xyl, pNp-β-Xyl,
pNp-α-Man, lactoside-Bz, Lac-ceramide, Gal-ceramide,
paragloboside, globoside, Gal-β1-4 GalNAc-α-pNp, Gal-β1-3
GlcNAc-β-pNp, GlcNAc-β1-4 GlcNAc-β-Bz, pNp-core1 (Gal-β1-3
GalNAc-α-pNp), pNp-core2 (Gal-β1-3 (GlcNAc-β1-6) GalNAc-α-
pNp), pNp-core3 (GlcNAc-β1-3 GalNAc-α-pNp) and pNp-core6
(GlcNAc-β1-6 GalNAc-α-pNp). "Lac" represents a D-lactose
residue.

Each reaction solution (final concentrations in
parentheses) was prepared by adding each substrate (50
nmol), HEPES buffer (pH 7.4, 14 mM), Triton CF-54 (trade
name) (0.3%), UDP-GalNAc (2 mM), MnCl2 (10 mM), UDP-
[3H]GlcNAc and 5 µl G34 enzyme solution, followed by
dilution with H2O to a total volume of 20 µl.

The above reaction mixtures were each reacted at 37°C
for 2 hours. After completion of the reaction, 200 µl H2O
was added and each mixture was lightly centrifuged to
obtain the supernatant. The supernatant was passed through
a Sep-Pak plus C18 Cartridge (Waters), which had been
washed once with 1 ml methanol and twice with 1 ml H2O and
then equilibrated, to allow the substrate and product in
the supernatant to adsorb to the cartridge. After washing
the cartridge twice with 1 ml H2O, the adsorbed substrate
and product were eluted with 1 ml methanol. The eluate was
mixed with 5 ml liquid scintillator ACSII (Amersham
Biosciences) and measured for the amount of radiation with
a scintillation counter (Beckman Coulter).

The results thus measured were compared assuming that
the radioactivity obtained using Bz-β-GlcNAc as a substrate
was set to 100% (see Table 5). When used as a substrate,
pNp-core2 showed the largest increase in radioactivity.
Bz-β-GlcNAc, GlcNAc-β1-4-GlcNAc-β-Bz, pNp-core6 and pNp-
core3 also showed increases in radioactivity in the order
named. The other substrates showed no increase in
radioactivity.

Table 5

No.   Acceptor substrate         %
 1    pNp-α-Gal                  N.D.
 2    oNp-β-Gal                  N.D.
 3    Bz-α-GlcNAc                N.D.
 4    Bz-β-GlcNAc                100
 5    Bz-α-GalNAc                N.D.
 6    pNp-β-GalNAc               N.D.
 7    pNp-α-Glc                  N.D.
 8    pNp-β-Glc                  N.D.
 9    pNp-β-GlcA                 N.D.
10    pNp-α-Fuc                  N.D.
11    pNp-α-Xyl                  N.D.
12    pNp-β-Xyl                  N.D.
13    pNp-α-Man                  N.D.
14    Lactoside-Bz               N.D.
15    Lac-ceramide               N.D.
16    Gal-ceramide               N.D.
17    Paragloboside              N.D.
18    Globoside                  N.D.
19    Galβ1-4GalNAc-α-pNp        N.D.
20    Galβ1-3GlcNAc-β-pNp        N.D.
21    GlcNAcβ1-4GlcNAc-β-Bz      29
22    core1-pNp                  N.D.
23    core2-pNp                  185
24    core3-pNp                  8
25    core6-pNp                  19

N.D.: Not determined due to no radioactivity
core1: Gal-β1-3-GalNAc-α-pNp
core2: Gal-β1-3-(GlcNAc-β1-6)GalNAc-α-pNp
core3: GlcNAc-β1-3-GalNAc-α-pNp
core6: GlcNAc-β1-6-GalNAc-α-pNp



(2) Confirmation of activity by HPLC analysis

Using uridine diphosphate-N-acetylgalactosamine (UDP-
GalNAc; Sigma-Aldrich Corporation) as a sugar residue donor
substrate and Bz-β-GlcNAc as a sugar residue acceptor
substrate, the enzymatic activity of G34 was analyzed by
high performance liquid chromatography (HPLC).

The reaction solution (final concentrations in
parentheses) was prepared by adding Bz-β-GlcNAc (10 nmol),
HEPES buffer (pH 7.4, 14 mM), Triton CF-54 (trade name)
(0.3%), UDP-GalNAc (2 mM), MnCl2 (10 mM) and 10 µl G34
enzyme solution, followed by dilution with H2O to a total
volume of 20 µl. This reaction solution was reacted at 37°C
for 16 hours. The reaction was stopped by addition of H2O
(100 µl) and the reaction solution was purified by
filtration through an Ultrafree-MC (Millipore Corporation).

The filtrate (10 µl) was analyzed by high performance
liquid chromatography (HPLC) using a reversed-phase column
ODS-80Ts QA (4.6 x 250 mm, Tosoh Corporation, Japan). The
developing solvent used was an aqueous 9% acetonitrile-0.1%
trifluoroacetic acid solution. The elution conditions were
set to 1 ml/minute at 40°C. Absorbance at 210 nm was used
as an index for elution peak detection using an SPD-10Avp
(Shimadzu Corporation, Japan).

As a result, a new elution peak was observed, which
was not detected in the control.

(3) Analysis of reaction product by mass spectrometry

The above peak was collected and the reaction product
was analyzed by mass spectrometry. Matrix-assisted laser
desorption ionization-time of flight mass spectrometry
(MALDI-TOF-MS) was performed using a Reflex IV (Bruker
Daltonics). The sample at 10 pmol was dried and dissolved
in 1 µl distilled water for use as a MALDI-TOF-MS sample.

As a result, a peak at 538.194 m/z was observed.
This peak corresponded to the molecular weight of GalNAc-
GlcNAc-Bz (sodium salt).

This result also indicated that the G34 enzyme
protein transfers GalNAc to Bz-β-GlcNAc.

Example 3: Measurement for mRNA expression level of human 
G34 

(1) Expression levels in various human normal tissues 

Quantitative real-time PCR was used for comparing the 
mRNA expression levels of G34 in human normal tissues. 
Quantitative real-time PCR is a PCR method using a sense 
primer and an antisense primer in combination with a 
fluorescently-labeled probe. When a gene is amplified by
PCR, a fluorescent label of the probe will be released to 
produce fluorescence. The fluorescence intensity is 
amplified in correlation with gene amplification and thus 
used as an index for quantification. 

RNA of each human normal tissue (Clontech) was
extracted with an RNeasy Mini Kit (QIAGEN) and converted
into single strand DNA by the oligo(dT) method using a
Super-Script First-Strand Synthesis System (Invitrogen
Corporation). This DNA was used as a template and
subjected to quantitative real-time PCR in an ABI PRISM
7700 (Applied Biosystems Japan Ltd.) using a 5'-primer (SEQ
ID NO: 14), a 3'-primer (SEQ ID NO: 15) and a TaqMan probe
(SEQ ID NO: 16). PCR was performed under conditions of 50°C
for 2 minutes and 95°C for 10 minutes, and then under
conditions of 50 cycles of 95°C for 15 seconds and 60°C for
1 minute. To prepare a calibration curve, plasmid DNA
obtained by introducing a partial sequence of G34 into
pFLAG-CMV3 (Invitrogen Corporation) was used as a template
and subjected to PCR as described above.
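
A calibration curve of this kind is commonly evaluated by fitting the threshold cycle (Ct) against the logarithm of the template copy number and then reading unknown samples off the fitted line. The sketch below shows one way this could be done; it is an assumption about the general approach, not the procedure used in this document. The dilution series and Ct values are invented purely for illustration, and the function names are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate the copy number of a sample."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical plasmid dilution series (NOT data from this document).
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [30.0, 26.7, 23.4, 20.1, 16.8]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
efficiency = 10 ** (-1.0 / slope) - 1.0  # amplification efficiency per cycle
sample_copies = copies_from_ct(25.0, slope, intercept)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, efficiency={efficiency:.2f}")
print(f"estimated copies at Ct 25.0: {sample_copies:.0f}")
```

The estimated copy number would then be scaled to the amount of total RNA used, which is how values per µg of total RNA such as those in the tables can be obtained.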

The results confirmed that high-level expression was 
observed specifically in the testis, followed by skeletal 
muscle and prostate in the order named (Table 6). 

Table 6

G34 mRNA expression levels in human normal tissue

Tissue                     Copy number                 Standard error
                           (x10000/µg total RNA)
Brain                        5.0                         1.1
Fetal brain                 10.3                         0.7
Cerebellum                   2.8                         0.3
Medulla oblongata            4.9                         0.3
Submandibular gland          6.7                         0.4
Thyroid gland                1.8                         0.6
Trachea                      3.9                         0.3
Lung                         0.4                         0.1
Heart                        0.1                         0.1
Skeletal muscle             25.8                         1.1
Small intestine              5.1                         0.3
Large intestine (colon)      0.6                         0.3
Liver                        0.3                         0.1
Fetal liver                  0.7                         0.3
Pancreas                     4.2                         1.1
Kidney                       1.6                         0.3
Adrenal gland               10.8                         1.3
Thymus                       4.8                         0.2
Bone marrow                  3.1                         0.4
Spleen                       4.2                         0.3
Testis                     115.5                         2.0
Prostate                    14.6                         1.5
Mammary gland                5.2                         0.2
Uterus                       5.0                         0.2
Placenta                     1.4                         0.4
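
The ranking of tissues stated in the text can be reproduced directly from the Table 6 values. The snippet below is a small illustration; `table6` and `top_tissues` are hypothetical names, and the numbers are the copy numbers as transcribed from the table.

```python
# Copy number (x10000 per µg total RNA) by tissue, transcribed from Table 6.
table6 = {
    "Brain": 5.0, "Fetal brain": 10.3, "Cerebellum": 2.8,
    "Medulla oblongata": 4.9, "Submandibular gland": 6.7,
    "Thyroid gland": 1.8, "Trachea": 3.9, "Lung": 0.4, "Heart": 0.1,
    "Skeletal muscle": 25.8, "Small intestine": 5.1,
    "Large intestine (colon)": 0.6, "Liver": 0.3, "Fetal liver": 0.7,
    "Pancreas": 4.2, "Kidney": 1.6, "Adrenal gland": 10.8, "Thymus": 4.8,
    "Bone marrow": 3.1, "Spleen": 4.2, "Testis": 115.5, "Prostate": 14.6,
    "Mammary gland": 5.2, "Uterus": 5.0, "Placenta": 1.4,
}


def top_tissues(levels, n=3):
    """Return the n tissues with the highest copy numbers, highest first."""
    return sorted(levels.items(), key=lambda item: item[1], reverse=True)[:n]


for tissue, copies in top_tissues(table6):
    print(f"{tissue}: {copies}")
```

Sorting the values this way lists testis first, followed by skeletal muscle and prostate, matching the order given above.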

(2) Expression levels in human cancer cell lines

Quantitative real-time PCR as mentioned above was
used for comparing the mRNA expression levels of G34 in
various cancer-derived human cell lines. After cells of
each human cell line were collected, RNA was extracted with
an RNeasy Mini Kit (QIAGEN) and converted into single
strand DNA by the oligo(dT) method using a Super-Script
First-Strand Synthesis System (Invitrogen Corporation).
This DNA was used as a template and subjected to
quantitative real-time PCR in an ABI PRISM 7700 (Applied
Biosystems Japan Ltd.) using a 5'-primer (SEQ ID NO: 14), a
3'-primer (SEQ ID NO: 15) and a TaqMan probe (SEQ ID NO:
16). PCR was performed under conditions of 50°C for 2
minutes and 95°C for 10 minutes, and then under conditions
of 50 cycles of 95°C for 15 seconds and 60°C for 1 minute.

As a result, the expression was observed in all the
human cell lines (Table 7, Figure 9).

Table 7

G34 mRNA expression levels in human cell lines

Category                  Cell line        Copy number              Standard
                                           (x10^4/µg total RNA)     error
Neuroblastoma             SCCH-26            7.9                     0.6
                          NAGAI             19.5                     1.5
                          NB-9              40.6                     2.3
                          SK-N-SH           14.9                     0.7
                          SK-N-MC            5.8                     0.5
                          NB-1              20.9                     0.5
                          IMR32             21.0                     0.2
Glioma                    T98G               6.2                     0.2
                          YKG-1              3.9                     0.0
                          A172              13.4                     0.9
                          GI-1              13.7                     1.3
                          U118MG             6.8                     0.5
                          U251              28.9                     1.9
                          KG-1-C             9.1                     0.6
Lung cancer               LU130              6.8                     0.4
                          Lu134A            30.3                     1.2
                          LU134B             6.8                     0.4
                          Lu135              7.2                     1.3
                          Lu139             10.7                     0.5
                          Lu140             15.4                     1.8
                          SBC-1              2.5                     0.2
                          PC-7               9.1                     0.2
                          PC-9              22.4                     0.1
                          HAL-8             15.2                     1.2
                          HAL-24            20.8                     1.7
                          ABC-1             10.3                     0.9
                          RERF-LC-[illegible]  22.8                  2.2
                          EHHA-9            20.3                     7.9
                          PC-1               2.1                     0.2
                          EBC-1              4.4                     0.2
                          PC-10            118.8                     4.9
                          A549              27.1                     2.6
                          LX-1              30.7                     2.1
Esophageal cancer         ES1               23.0                     2.5
                          ES2               16.1                     0.6
                          ES6               42.8                     3.0
Gastric cancer            MKN1               6.2                     1.1
                          MKN28              8.6                     1.0
                          MKN7               9.7                     0.1
                          MKN74              3.5                     0.8
                          MKN-45             7.3                     2.1
                          HSC-43            42.8                     1.7
                          KATOIII            6.4                     0.4
                          TMK-1             10.8                     1.2
Large intestine           LSC               11.8                     0.6
(colon) cancer            LSB                4.9                     0.3
                          SW480             10.1                     0.4
                          SW1116            24.1                     1.4
                          Colo201           10.4                     0.4
                          Colo205            6.8                     0.9
                          C1                21.9                     1.2
                          WiDr               1.2                     0.0
                          HCT8              82.2                     6.2
                          HCT15             12.1                     1.0
Others, Leukemia,         A204              67.9                     4.4
Lymphoma                  A-431             30.6                     2.5
[group boundaries         SW1736            11.9                     1.1
not legible]              HepG2              2.3                     0.3
                          Capan-2           19.4                     1.2
                          293T              55.1                     8.3
                          [illegible]A-1     3.5                     0.6
                          HL-60              2.1                     0.1
                          K-562             17.1                     1.8
                          Daudi              2.4                     0.2
                          Namalwa           13.0                     1.2
                          KHM-1B            16.4                     0.4
                          Ramos              9.5                     0.7
                          Raji              11.6                     1.3
                          Jurkat            42.7                     1.9

(3) Expression levels in cancerous tissues

Quantitative real-time PCR as mentioned above was
used for comparing the mRNA expression levels of G34 in
cancer tissues and their surrounding normal tissues derived
from patients with large intestine (colon) cancer and lung
cancer.

From cancer and normal tissues of the same patient,
RNA was extracted with an RNeasy Mini Kit (QIAGEN) and
converted into single strand DNA by the oligo(dT) method
using a Super-Script First-Strand Synthesis System
(Invitrogen Corporation). This DNA was used as a template
and subjected to quantitative real-time PCR in an ABI PRISM
7700 (Applied Biosystems Japan Ltd.) using a 5'-primer (SEQ
ID NO: 14), a 3'-primer (SEQ ID NO: 15) and a TaqMan probe
(SEQ ID NO: 16). PCR was performed under conditions of 50°C
for 2 minutes and 95°C for 10 minutes, and then under
conditions of 50 cycles of 95°C for 15 seconds and 60°C for
1 minute. To correct variations among individuals, the
resulting data were divided by the value of β-actin
(internal standard gene) quantified using a kit of Applied
Biosystems Japan before being compared.

The results indicated that the mRNA expression level
of the G34 gene was significantly increased in these
cancerous tissues (Table 8, Table 9).

Table 8

G34 mRNA expression levels in tissues
from large intestine cancer patients

Patient   Normal   Standard      Cancer   Standard   %Change
No.       tissue   error         tissue   error
1         0.15     0.04          0.35     0.07       2.3
2         0.15     0.07          8.63     0.65       58.0
3         0.07     0.02          1.55     0.15       23.5
4         0.08     0.05          1.82     0.26       22.0
5         0.08     0.02          0.60     0.07       7.2
6         1.04     0.08          1.92     0.21       1.8
7         0.07     0.02          5.37     1.06       81.3
8         1.54     0.27          8.30     0.96       5.4
9         0.05     0.04          1.70     0.37       34.3
10        0.05     0.04          0.10     0.04       2.0
11        0.60     0.29          10.23    1.47       17.2
12        0.17     0.13          2.36     0.43       14.3
13        0.18     0.09          1.70     0.27       9.4
14        0.18     0.08          2.76     0.23       15.2
15        0.18     [illegible]   ?.49     0.34       19.2
16        0.20     0.15          1.84     0.25       9.3
17        0.28     0.05          7.41     0.51       26.4
18        0.05     0.04          5.92     0.38       119.3
19        0.15     0.11          4.68     0.67       31.4
20        0.13     0.06          4.61     2.22       34.9
21        0.02     0.02          8.40     1.65       508.0
22        0.20     0.07          3.57     0.43       18.0
23        0.55     0.27          2.33     1.23       4.3
Average   0.25     0.07          3.97     0.55       15.6

Copy number (×10000/ng total RNA)

Table 9

G34 mRNA expression levels in tissues
from lung cancer patients

Patient   Normal   Standard   Cancer   Standard   %Change
No.       tissue   error      tissue   error
1         0.48     0.06       2.03     0.27       4.2
3         0.00     0.00       0.55     0.21       -
4         2.43     0.40       6.13     0.17       2.5
5         0.10     0.04       2.74     0.32       27.7
6         1.69     0.28       3.11     0.69       1.8
7         0.60     0.16       2.76     0.35       4.6
8         2.30     0.38       6.23     0.21       2.7
9         1.26     0.27       2.51     0.10       2.0
10        1.47     0.18       4.76     0.57       3.2
11        0.64     0.00       1.14     0.11       1.8
12        0.56     0.06       0.69     0.04       1.2
13        1.32     0.02       1.98     0.15       1.5
14        0.17     0.02       0.66     0.02       4.0
15        0.71     0.05       2.71     0.13       3.8
16        1.07     0.13       15.64    1.11       14.6
17        1.03     0.12       8.27     0.73       8.1
18        0.13     0.02       1.95     0.09       14.8
Average   0.94     0.71       3.76     3.64       4.0

Copy number (×10000/ng total RNA)
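
The last column of Tables 8 and 9 appears to correspond to the ratio of the cancer tissue value to the normal tissue value. The following sketch recomputes that ratio for a few rows; the function name is hypothetical, and because the tabulated inputs are rounded, a recomputed ratio may differ slightly from the printed one.

```python
def expression_ratio(normal, cancer):
    """Return cancer/normal, or None when the normal value is zero (ratio undefined)."""
    if normal == 0:
        return None
    return cancer / normal


# (label, normal tissue, cancer tissue), values as printed in Tables 8 and 9
rows = [
    ("Table 8, patient 1", 0.15, 0.35),
    ("Table 9, patient 3", 0.00, 0.55),
    ("Table 9, patient 4", 2.43, 6.13),
]

for label, normal, cancer in rows:
    ratio = expression_ratio(normal, cancer)
    shown = "-" if ratio is None else f"{ratio:.1f}"
    print(f"{label}: {shown}")
```

A zero normal-tissue value is reported as "-" here, matching the entry for patient 3 in Table 9.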



Example 4: Cloning and expression of mouse G34 gene 

The human G34 sequence obtained in Example 1 was used
as a query for a search against the mouse gene sequence
serela (Applied Biosystems) to thereby find a corresponding
nucleic acid sequence with high homology. The open reading
frame (ORF) estimated from this nucleic acid sequence is
composed of 1515 bp (SEQ ID NO: 3), i.e., 504 amino acids
(SEQ ID NO: 4) when calculated as an amino acid sequence,
and has a hydrophobic amino acid region characteristic of
glycosyltransferases at its N-terminal end. This sequence
shares a homology of 86% (nucleic acid sequence) and 88%
(amino acid sequence) with human G34 (SEQ ID NOs: 1 and 2)
(see Figure 10). Moreover, the sequence retains all of the
three motifs conserved in the β3GalT family. The product
encoded by the nucleic acid sequence of SEQ ID NO: 3 and
the amino acid sequence of SEQ ID NO: 4 was designated
mouse G34 (mG34).
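
The relation between the ORF length and the amino acid count stated above can be checked with a short calculation. This sketch assumes that the 1515 bp figure includes the terminal stop codon, which the text does not state explicitly; the function name is hypothetical.

```python
def orf_protein_length(orf_bp, includes_stop_codon=True):
    """Number of amino acids encoded by an ORF of the given length in base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # Assumption: the final codon is a stop codon and encodes no amino acid.
    return codons - 1 if includes_stop_codon else codons


print(orf_protein_length(1515))  # 505 codons, one of which is assumed to be the stop codon
```

Under that assumption, 1515 bp corresponds to 505 codons and therefore to the 504 amino acids given for SEQ ID NO: 4.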

To examine the activity of mG34, mG34 was expressed
in a mammalian cell line. In this example,
the active region covering amino acid 35 to the C-terminal
end of mG34 was genetically introduced into a mammalian
cell line expression vector pFLAG-CMV3 using a FLAG Protein
Expression system (Sigma-Aldrich Corporation).

The expression in mouse tissues was confirmed by PCR.
Each mouse tissue (brain, thymus, stomach, small intestine,
large intestine (colon), liver, pancreas, spleen, kidney,
testis or skeletal muscle) was used as a template and
subjected to PCR using a 5'-primer (mG34-CMV-F1; SEQ ID
NO: 17) and a 3'-primer (mG34-CMV-R1; SEQ ID NO: 18). PCR
was performed under conditions of 25 cycles of 98°C for
10 seconds, 55°C for 30 seconds, and 72°C for 2 minutes.
The PCR product was electrophoresed on an agarose gel to
confirm a band of approximately 1500 bp. As a result, as
shown in Table 10, the expression level was highest in the
testis, followed by spleen and skeletal muscle in the order
named.

Table 10

mG34 mRNA expression levels in mouse tissues

Tissue                    Expression level
Brain                     ±
Thymus                    -
Stomach                   +
Small intestine
Large intestine (colon)   +
Liver                     +
Pancreas
Spleen
Kidney                    ++
Testis                    +++
Skeletal muscle           ++


Mouse testis-derived cDNA was used as a template and
subjected to PCR using a 5'-primer (mG34-CMV-F1; SEQ ID
NO: 17) and a 3'-primer (mG34-CMV-R1; SEQ ID NO: 18) to
obtain a DNA fragment of interest. PCR was performed under
conditions of 25 cycles of 98°C for 10 seconds, 55°C for
30 seconds, and 72°C for 2 minutes. The PCR product was
then electrophoresed on an agarose gel and isolated in a
standard manner after gel excision. This PCR product has
restriction enzyme sites HindIII and NotI at the 5' and 3'
sides, respectively.
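
As a rough planning aid, the cycling conditions given above can be written down as data and the total hold time computed. This is a minimal sketch: the variable and function names are hypothetical, and instrument ramp times and any initial or final hold steps are not included because the text does not specify them.

```python
# (step name, temperature in deg C, hold time in seconds), as stated in the text
CYCLE_STEPS = [
    ("denaturation", 98, 10),
    ("annealing", 55, 30),
    ("extension", 72, 120),
]
CYCLES = 25


def total_hold_seconds(steps, cycles):
    """Sum of programmed hold times over all cycles (ramp times excluded)."""
    return cycles * sum(seconds for _name, _temp, seconds in steps)


total = total_hold_seconds(CYCLE_STEPS, CYCLES)
print(f"{total} s (about {total / 60:.0f} min)")
```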

After this DNA fragment and pFLAG-CMV3 were each
treated with restriction enzymes HindIII and NotI, the
reaction solutions were mixed together and subjected to
ligation reaction, so that the DNA fragment was introduced
into pFLAG-CMV3. The reaction solution was purified by
ethanol precipitation and then mixed with competent cells
(E. coli DH5α). After heat shock treatment (42°C, 30
seconds), the cells were seeded on ampicillin-containing LB
agar medium.

On the next day, the resulting colonies were 
confirmed by direct PCR for the DNA of interest. For more 
reliable results, after sequencing to confirm the DNA 
sequence, the vector (pFLAG-CMV3-mG34A) was extracted and 
purified. 

Human kidney cell-derived cell line 293T cells (2 ×
10^6) were suspended in 10 ml antibiotic-free DMEM medium
(Invitrogen Corporation) supplemented with 10% fetal bovine
serum, seeded in a 10 cm dish and cultured for 16 hours at
37°C in a CO2 incubator. pFLAG-CMV3-mG34A (20 µg) and
Lipofectamine 2000 (30 µl, Invitrogen Corporation) were each
mixed with 1.5 ml OPTI-MEM (Invitrogen Corporation) and
incubated at room temperature for 5 minutes. These two
solutions were further mixed gently and incubated at room
temperature for 20 minutes. This mixed solution was added
dropwise to the dish and cultured for 48 hours at 37°C in a
CO2 incubator.

The supernatant (10 ml) was mixed with NaN3 (0.05 %),
NaCl (150 mM), CaCl2 (2 mM) and anti-M1 resin (100
SIGMA), followed by overnight stirring at 4°C. On the next
day, the supernatant was centrifuged (3000 rpm, 5 minutes,
4°C) to collect a pellet fraction. After addition of 2 mM
CaCl2-TBS (900 µl), centrifugation was repeated (2000 rpm,
5 minutes, 4°C) and the resulting pellet was suspended in
200 µl of 1 mM CaCl2-TBS for use as a sample for activity
measurement (mouse G34 enzyme solution). A part of this
sample was electrophoresed by SDS-PAGE and Western blotted
using anti-FLAG M2-peroxidase (SIGMA) to confirm the
expression of the mG34 protein of interest. As a result, a
band was detected at a position of about 60 kDa, thus
confirming the expression of the mG34 protein.

Example 5: Search for glycosyltransferase activity of mouse G34

The following reaction system was used for examining
mouse G34 for its substrate specificity in its β1,3-N-acetylgalactosamine
transferase activity. In the reaction
solutions shown below, each of the following was used at
10 nmol as an "acceptor substrate": pNp-α-Gal, oNp-β-Gal,
Bz-α-GlcNAc, Bz-β-GlcNAc, Bz-α-GalNAc, pNp-β-GalNAc,
pNp-α-Glc, pNp-β-Glc, pNp-β-GlcA, pNp-α-Fuc, pNp-α-Xyl,
pNp-β-Xyl, pNp-α-Man, lactoside-Bz, Lac-ceramide,
Gal-ceramide, Gb3, globoside, Gal-β1-4GalNAc-α-pNp,
Galβ1-3GlcNAc-β-Bz, GlcNAc-β1-4-GlcNAc-β-Bz, core1-pNp,
core2-pNp, core3-pNp and core6-pNp (all purchased from
SIGMA).

Each reaction solution was prepared as follows (final
concentrations in parentheses): each substrate (10 nmol),
HEPES (N-[2-hydroxyethyl]piperazine-N'-[2-ethanesulfonic
acid]) (pH 7.4, 14 mM), MnCl2 (10 mM), Triton CF-54 (trade
name) (0.3 %), UDP-GalNAc (2 mM) and UDP-[14C]GlcNAc (40
nCi) were mixed and supplemented with 5 µl mouse G34 enzyme
solution, followed by dilution with H2O to a total volume
of 20 µl.
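
The amounts implied by these final concentrations can be worked out with a short calculation, since a concentration in mM multiplied by a volume in µl gives an amount in nmol. This sketch assumes a total reaction volume of 20 µl; the function and variable names are hypothetical.

```python
TOTAL_VOLUME_UL = 20.0  # assumed total reaction volume in microlitres


def amount_nmol(concentration_mM, volume_ul):
    """mM x microlitre = nmol."""
    return concentration_mM * volume_ul


def concentration_mM(amount_in_nmol, volume_ul):
    """nmol / microlitre = mM."""
    return amount_in_nmol / volume_ul


# Final concentrations (mM) as listed in the text
final_mM = {"HEPES": 14.0, "MnCl2": 10.0, "UDP-GalNAc": 2.0}

for name, conc in final_mM.items():
    print(f"{name}: {amount_nmol(conc, TOTAL_VOLUME_UL):.0f} nmol per reaction")

# Each acceptor substrate is given as an amount (10 nmol) rather than a concentration
print(f"acceptor: {concentration_mM(10.0, TOTAL_VOLUME_UL):.1f} mM")
```

Under this assumption the donor UDP-GalNAc (40 nmol) is present in fourfold molar excess over the acceptor substrate (10 nmol, i.e. 0.5 mM).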

The above reaction mixtures were each reacted at 37°C
for 16 hours. After completion of the reaction, 200 µl H2O
was added and each mixture was lightly centrifuged to
obtain the supernatant. The supernatant was passed through
a Sep-Pak plus C18 Cartridge (Waters), which had been
washed once with 1 ml methanol and twice with 1 ml H2O and
then equilibrated, to allow the substrate and product in
the supernatant to adsorb to the cartridge. After washing
the cartridge twice with 1 ml H2O, the adsorbed substrate
and product were eluted with 1 ml methanol. The eluate was
mixed with 5 ml liquid scintillator ACSII (Amersham
Biosciences) and measured for the amount of radiation with
a scintillation counter (Beckman Coulter).

The results thus measured were compared assuming that
the radioactivity obtained using Bz-β-GlcNAc as a substrate
was set to 100% (Table 11). When used as a substrate,
Bz-β-GlcNAc showed the largest increase in radioactivity.
core2-pNp, core6-pNp, core3-pNp, pNp-β-Glc and
GlcNAc-β1-4-GlcNAc-β-Bz also showed high radioactivity in
the order named. The other substrates showed no increase in
radioactivity.
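
The conversion of measured radioactivity into the relative values of Table 11 can be sketched as below. The counts are hypothetical and chosen only for illustration, and the background subtraction is an assumption, since the text does not say how blank counts were handled.

```python
def relative_activity(counts, reference, background=0.0):
    """Express each background-corrected count as a percentage of the reference substrate."""
    ref = counts[reference] - background
    if ref <= 0:
        raise ValueError("reference substrate shows no signal above background")
    return {
        name: 100.0 * max(value - background, 0.0) / ref
        for name, value in counts.items()
    }


# Hypothetical scintillation counts (cpm), not data from the specification
counts = {
    "Bz-beta-GlcNAc": 4050.0,
    "core2-pNp": 1050.0,
    "pNp-alpha-Gal": 50.0,
}

result = relative_activity(counts, reference="Bz-beta-GlcNAc", background=50.0)
for name, percent in result.items():
    print(f"{name}: {percent:.0f}%")
```

A substrate whose count does not exceed the background comes out as 0% in this sketch, which would correspond to an ND entry in the table.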

Table 11

Acceptor substrate        %
pNp-α-Gal                 ND
oNp-β-Gal                 ND
Bz-α-GlcNAc               ND
Bz-β-GlcNAc               100
Bz-α-GalNAc               ND
pNp-β-GalNAc              ND
pNp-α-Glc                 ND
pNp-β-Glc                 12
pNp-β-GlcA                ND
pNp-α-Fuc                 ND
pNp-α-Xyl                 ND
pNp-β-Xyl                 ND
pNp-α-Man                 ND
Lactoside-Bz              ND
Lac-ceramide              ND
Gal-ceramide              ND
Gb3                       ND
Globoside                 ND
Galβ1-4GalNAc-α-pNp       ND
Galβ1-3GlcNAc-β-pNp       ND
GlcNAcβ1-4GlcNAc-β-Bz     10
core1-pNp                 ND
core2-pNp                 25
core3-pNp                 14
core6-pNp                 18

Example 6: In situ hybridization on mouse testis

In situ hybridization using mG34 was performed on a
mouse testis-derived sample to confirm the expression of
mG34 in the mouse testis sample (see Figure 11).

Example 7: Creation of G34 knockout mouse

A targeting vector (pBSK-mG34-KOneo) is constructed
by inserting into pBluescript II SK(-) (TOYOBO) a
chromosomal fragment (about 10 kb) primarily composed of
the exons containing the activation domains of the gene
(mG34) to be knocked out, i.e., Exons 3 to 12 (1242 bp)
within the ORF region of mG34.
pBSK-mG34-KOneo is also designed to have the drug
resistance gene neo (neomycin resistance gene) introduced
into Exons 7 to 9, which are putative GalNAc transferase
active regions of mG34. As a result, Exons 7 to 9 of mG34
are deleted and replaced by neo. The pBSK-mG34-KOneo thus
obtained is linearized with the restriction enzyme NotI, 80
µg of which is then transfected (e.g., by electroporation)
into ES cells (derived from E14/129Sv mice) to select
G418-resistant colonies. The G418-resistant colonies are
transferred to 24-well plates and then cultured. After a
part of the cells are frozen and stored, DNA is extracted
from the remaining ES cells and around 120 colonies of
recombinant clones are selected by PCR. Further, Southern
blotting or other techniques are performed to confirm
whether recombination has occurred as expected, finally selecting
around 10 clones of recombinants. ES cells from two of the
selected clones are injected into C57BL/6 mouse blastocysts.
The mouse embryos injected with the ES cells are
transplanted into the uteri of recipient mice to generate
chimeric mice, followed by germline transmission to obtain
heterozygous knockout mice.



